Hello devs,

I would like to open a discussion about persistence possibilities for the SQL 
Gateway. At Cloudera, we are happy to see the work already done on this project 
and are looking for ways to utilize it on our platform as well. Currently it 
lacks some features that would be essential in our case, and we could help out 
with those.

I am not sure if any thought went into gateway persistence specifics already, 
and this feature could be implemented in fundamentally different ways, so I 
think the first step could be to agree on the basics.

First, in my opinion, persistence should be an optional feature of the gateway 
that can be enabled if desired. There can be a lot of implementation details, 
but I see a few major directions to follow:

- Utilize Hive catalog: The Hive catalog can already be used to have 
persistent meta-objects, so the crucial thing that would be missing in this 
case is support for other catalogs. Personally, I would not pursue this option, 
because in my opinion it would limit the usability of this feature too much.
- Serialize the session as is: Saving the whole session (or its context) [1] as 
is to durable storage, so it can be kept and picked up again.
- Serialize the required elements (catalogs, tables, functions, etc.), not 
necessarily as a whole: The main point here would be to serialize a different 
object, so the persistent data will not be that sensitive to changes of the 
session (or its context). There are several trade-offs here, such as keeping 
the model close to the session itself, so the boilerplate required for the 
mapping can be kept to a minimum, or focusing on saving only what is actually 
necessary, making the persistent storage more portable.
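
To make the third direction a bit more concrete, here is a minimal sketch of a 
snapshot object that is separate from the session class. Everything in it is an 
assumption for illustration only: the SessionSnapshot class, its fields, the 
key prefixes and the choice of java.util.Properties as the storage format are 
hypothetical and not part of the gateway today. It stores each element as the 
DDL statement that would recreate it, plus a format version.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical snapshot model, decoupled from the gateway's Session class. */
final class SessionSnapshot {
    // Bumped whenever the persisted layout changes (assumed convention).
    static final int FORMAT_VERSION = 1;

    final String sessionId;
    // Element name -> DDL statement that recreates the element.
    final Map<String, String> catalogs;
    final Map<String, String> tables;
    final Map<String, String> functions;

    SessionSnapshot(String sessionId, Map<String, String> catalogs,
                    Map<String, String> tables, Map<String, String> functions) {
        this.sessionId = sessionId;
        this.catalogs = new TreeMap<>(catalogs);
        this.tables = new TreeMap<>(tables);
        this.functions = new TreeMap<>(functions);
    }

    /** Writes the snapshot as flat key/value text. */
    String serialize() {
        Properties p = new Properties();
        p.setProperty("version", Integer.toString(FORMAT_VERSION));
        p.setProperty("session.id", sessionId);
        catalogs.forEach((name, ddl) -> p.setProperty("catalog." + name, ddl));
        tables.forEach((name, ddl) -> p.setProperty("table." + name, ddl));
        functions.forEach((name, ddl) -> p.setProperty("function." + name, ddl));
        StringWriter out = new StringWriter();
        try {
            p.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Reads a snapshot back, rejecting unknown format versions. */
    static SessionSnapshot deserialize(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String version = p.getProperty("version");
        if (!Integer.toString(FORMAT_VERSION).equals(version)) {
            throw new IllegalStateException("Unsupported snapshot version: " + version);
        }
        return new SessionSnapshot(
                p.getProperty("session.id"),
                withPrefix(p, "catalog."),
                withPrefix(p, "table."),
                withPrefix(p, "function."));
    }

    // Collects all entries under the given key prefix, with the prefix stripped.
    private static Map<String, String> withPrefix(Properties p, String prefix) {
        Map<String, String> result = new TreeMap<>();
        for (String key : p.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                result.put(key.substring(prefix.length()), p.getProperty(key));
            }
        }
        return result;
    }
}
```

With a model like this, restoring a session would mean replaying the stored 
statements against a fresh session, so the persisted data depends on the SQL 
dialect rather than on the internal layout of the session class. The cost is 
the extra mapping code between the session and the snapshot.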

WDYT?

Cheers,
F

[1] 
https://github.com/apache/flink/blob/master/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/session/Session.java
