[ https://issues.apache.org/jira/browse/FLINK-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16864356#comment-16864356 ]
Xuefu Zhang commented on FLINK-12771:
-------------------------------------

Hi [~dawidwys],

Re: #2, communicating with the Hive metastore is the main reason for having HiveCatalog, but that doesn't mean it's all HiveCatalog can do. In Hive, a user can create temporary tables/functions via DDL, which requires no communication with the Hive metastore. Such session-specific objects are stored in the user session (or client). (In fact, the Hive metastore provides a type of client that has this functionality built in.) Those objects share the same namespace with the persistent objects stored in the Hive metastore. I'd argue that Flink's temp objects, such as inline tables, have the same nature and can be handled in a similar way.

I agree this isn't the only solution. Storing them in a dedicated, in-memory catalog is one possibility, and having a session-specific structure holding them is another. We feel that extending HiveCatalog to store and share them has greater advantages.

Re: #3, persistence is just one part of the store, but we have decoupled the type of object to be stored from the nature of the catalog. This happened when we hosted both Flink tables and Hive tables via HiveCatalog. That is, a catalog may store any type of table. The work here is just an extension of that. Along the same lines, we shouldn't stop a user from defining a Hive table, a generic table, or an inline table in an in-memory catalog. If a Hive table is correctly defined in an in-memory catalog, Flink has no problem reading from or writing to that table. The difference is that the table definition doesn't survive beyond the user session.

I understand there might be a perception that anything registered in HiveCatalog should be persisted. We know this isn't true for Hive's temporary tables and functions. I think we just need to educate users that certain tables (such as inline tables) are temporary in nature and valid only in the current session, which was already true before HiveCatalog was introduced.
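As a rough illustration of the idea only (this is not the HiveCatalog code, and every class and method name below is hypothetical), a catalog could keep session-scoped tables in an in-memory map that shares one namespace with the persisted tables. A plain map stands in for the Hive metastore, and table definitions are reduced to strings:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: these names are illustrative and are not Flink's actual API.
class SessionAwareCatalog {
    // Stand-in for the external metastore; definitions stored here outlive the session.
    private final Map<String, String> persistentTables;
    // Session-scoped tables (e.g. inline sources/sinks); dropped when the session closes.
    private final Map<String, String> temporaryTables = new HashMap<>();

    SessionAwareCatalog(Map<String, String> persistentTables) {
        this.persistentTables = persistentTables;
    }

    void createTable(String path, String definition, boolean temporary) {
        // Both kinds of table share one namespace, so a name can be taken only once.
        if (tableExists(path)) {
            throw new IllegalStateException("Table already exists: " + path);
        }
        (temporary ? temporaryTables : persistentTables).put(path, definition);
    }

    Optional<String> getTable(String path) {
        // Look in the session map first, then fall back to the persistent store.
        String temp = temporaryTables.get(path);
        return Optional.ofNullable(temp != null ? temp : persistentTables.get(path));
    }

    boolean tableExists(String path) {
        return temporaryTables.containsKey(path) || persistentTables.containsKey(path);
    }

    // Ending the session discards temporary tables; persistent ones are untouched.
    void close() {
        temporaryTables.clear();
    }
}
```

In this sketch a user whose default catalog is the session-aware one can look up an inline table by the same kind of path as a persisted table, and the inline definition simply disappears on `close()`.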
Please let me know if there are more questions or comments.

> Support ConnectorCatalogTable in HiveCatalog
> --------------------------------------------
>
>                 Key: FLINK-12771
>                 URL: https://issues.apache.org/jira/browse/FLINK-12771
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Hive
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.9.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently {{HiveCatalog}} does not support {{ConnectorCatalogTable}}. This is
> a major drawback in real use cases: when Table API users set a
> {{HiveCatalog}} as their default catalog (which is very likely), they can no
> longer create or use any inline table sources/sinks with their default
> catalog. This makes it really inconvenient for Table API users to use Flink
> for exploration, experimentation, and production.
> There are several workarounds in this case. E.g. users can switch their
> default catalog, but that defeats our original intention of having a default
> {{HiveCatalog}}; or users can register their inline sources/sinks in Flink's
> default catalog, which is an in-memory catalog, but that not only requires
> users to type the full path of a table but also requires them to be aware of
> Flink's default catalog, default db, and their names. In short, none of the
> workarounds seems reasonable or user friendly.
> From another perspective, Hive has the concept of temporary tables that are
> stored in the memory of the Hive metastore client and are removed when the
> client is shut down. In Flink, {{ConnectorCatalogTable}} can be seen as a
> type of session-based temporary table, and {{HiveCatalog}} (potentially any
> catalog implementation) can store it in memory. By introducing the concept of
> temp tables, we could greatly reduce friction for users and improve their
> experience and productivity.
> Thus, we propose adding a simple in-memory map for {{ConnectorCatalogTable}}
> in {{HiveCatalog}} to allow users to create and use inline sources/sinks when
> their default catalog is a {{HiveCatalog}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)