Hey James, thanks for putting this up.

Inline:

> The suggestion is to make this part of Flight as an
> optional feature, rather than Flight SQL due to its applicability outside
> of just database access.

Which uses do you see? I see statefulness as a general antipattern here, so I'm 
wary of introducing it beyond where we need it. 

> - The Flight client supplies a New-Session header which has key-value pairs
> for initial session options. This header can be applied to any RPC call,
> but logically should be the first one the client makes.

Handshake already effectively serves as this RPC - maybe we could extend it? (I 
also see Handshake as an antipattern because it's a stateful auth mechanism.)

Should the session timeout/be on a lease? (gRPC doesn't really give the server 
a way to track the persistence of a particular client connection.)

> It's a bit asymmetric that creating a new session is done by applying a
> header, but closing a session is an RPC call. This was so that session
> creation doesn't introduce another round trip before the first real data
> request. If there's a way to batch RPC calls it might be better to make
> session creation an RPC call.

Is this a worrisome amount of overhead? 

Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same client 
generally share the same TCP connection (modulo load balancing behavior, but 
presumably that is not enabled if you want persistent sessions).

On the implementation side, I'd like to avoid baking this in too deeply if at 
all possible. Ideally it'd be implemented entirely as middleware, possibly 
making use of an interface so applications can override the session storage 
(hashtable, Redis, etcd, etc.)

> Just to chime in on this, one thing I'm curious about is whether there
> will be support for user-defined catalog/schema hierarchy depth?

Gavin, for ADBC we discussed adding a delimiter to the catalog name to handle 
this case - maybe we can handle this by adding a property for the delimiter to 
SqlInfo?

> https://github.com/influxdata/influxdb_iox/issues/6102

Andrew, do we need to look into adding more metadata to indicate different 
query languages? (It's quite a shame that we named this Flight SQL at this 
point...) 

On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote:
> Sorry for the late reply -- thank you James and David for this discussion.
>
> I agree that adding Catalog support would be a valuable addition to Flight
> SQL, and it recently came up as we begin to implement Flight SQL in
> InfluxDB IOx [1].
>
>> - A standard URI scheme for Flight SQL that can be used by multiple
> client APIs (JDBC, ADBC, etc.)
>
> I agree this would be very valuable, along with a standard way (ideally
> with HTTP headers) to send this information as part of the FlightSQL gRPC
> requests.
>
>> I'd suggest we define session management features explicitly in Flight
> (while being optional).
>
> I agree it is critical that server-side state is not required to implement
> FlightSQL. Stateful connections would likely complicate deploying FlightSQL
> in distributed systems. I suggest it should be possible to implement any
> session management features by sending the entire session state with the
> request, if desired.
>
> I don't have a strong opinion about the merits of including explicit
> session management features in FlightSQL. It seems to me that keeping the
> API surface of FlightSQL minimal and implementation flexibility maximal
> should be the default. However, if JDBC/ODBC driver compatibility would be
> improved with explicit state management APIs, then adding them to FlightSQL
> seems like a good idea to me.
>
> Thanks again -- it is amazing to hit some issue in design and then find out
> the Arrow community is already hard at work on a solution.
>
> Andrew
>
> [1] https://github.com/influxdata/influxdb_iox/issues/6102
>
> On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>
>> Just to chime in on this, one thing I'm curious about is whether there
>> will be support for user-defined catalog/schema hierarchy depth?
>>
>> This comment that James made does seem reasonable to me
>> > scheme://<host>:<port>/path-1/path-2/.../path-n
>>
>> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1)
>>
>> At Hasura, what we do is have an alias "FullyQualifiedName" which is
>> just "Array<String>"
>> and the identifier to some element in a data source is always
>> fully-qualified:
>>
>> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema
>>
>> ["postgres_1", "db1", "schema2", "my_table", "col_a"]
>> ["mongo", "db1",  "collection_a", "field_a"]
>> ["csv_adapter", "myfile.csv", "col_x"]
>>
>> On Wed, Nov 30, 2022 at 6:31 PM James Duong
>> <jam...@bitquilltech.com.invalid> wrote:
>> >
>> > Our current convention of sending connection properties as headers with
>> > every request has the benefit of making statefulness optional, but has
>> the
>> > drawback of sending redundant, unused properties on requests after the
>> > first, which increases the payload size unnecessarily.
>> >
>> > I'd suggest we define session management features explicitly in Flight
>> > (while being optional). The suggestion is to make this part of Flight as
>> an
>> > optional feature, rather than Flight SQL due to its applicability outside
>> > of just database access.
>> >
>> > Creating a session:
>> > - The Flight client supplies a New-Session header which has key-value
>> pairs
>> > for initial session options. This header can be applied to any RPC call,
>> > but logically should be the first one the client makes.
>> > - The server should send a Set-Cookie header back containing some
>> > server-side representation of the session that the client can use in
>> > subsequent requests.
>> > - The path specified in the URI is sent as a "Catalog" session option.
>> >
>> > Modifying session options:
>> > - A separate RPC call that takes in a Stream<string, string> representing
>> > each session option that is being modified and returns a stream of
>> statuses
>> > to indicate if the setting change was accepted.
>> > - This RPC call is only valid when the Cookie header is used.
>> > - It is up to the server to define if a failed session property change is
>> > fatal or if other properties can continue to be set.
>> >
>> > Closing a session:
>> > - A separate RPC call that tells the server to drop the session specified
>> > by the Cookie header.
>> >
>> > Notes:
>> > A Flight SQL client would check if session management RPCs are supported
>> > through a new GetSqlInfo property. A Flight client doesn't have a way to
>> do
>> > this generically, but there could be an application-specific RPC or
>> header
>> > that reports this metadata.
>> >
>> > The O/JDBC and ADBC drivers would need to be updated to programmatically
>> > check for session management RPCs. If unsupported, then use the old
>> > behavior of sending all properties as headers with each request. If
>> > supported, make use of the New-Session header and drop the session when
>> > closing the client-side connection.
>> >
>> > It's a bit asymmetric that creating a new session is done by applying a
>> > header, but closing a session is an RPC call. This was so that session
>> > creation doesn't introduce another round trip before the first real data
>> > request. If there's a way to batch RPC calls it might be better to make
>> > session creation an RPC call.
>> >
>> > On Tue, Nov 22, 2022 at 3:16 PM David Li <lidav...@apache.org> wrote:
>> >
>> > > It sounds reasonable - then there are three points:
>> > >
>> > > - A standard URI scheme for Flight SQL that can be used by multiple
>> client
>> > > APIs (JDBC, ADBC, etc.)
>> > > - A standard scheme for session data (likely header/cookie-based)
>> > > - A mapping from URI parameters and fields to session data
>> > >
>> > >
>> > >
>> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote:
>> > > > Just following up on this and if there are any thoughts.
>> > > >
>> > > > The purpose would be to standardize how we specify access to some
>> named
>> > > > logical grouping of data. This would make it easy to model
>> catalog/schema
>> > > > semantics in Flight SQL.
>> > > >
>> > > > Having this be part of the connection URI makes it similar to
>> specifying
>> > > a
>> > > > resource in an HTTP URL (ie an endpoint) which should make it easy
>> for
>> > > end
>> > > > users to work with and modify.
>> > > >
>> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong <jam...@bitquilltech.com
>> >
>> > > wrote:
>> > > >
>> > > >> As for surfacing catalogs itself, perhaps we allow the URI take in a
>> > > path
>> > > >> and treat that as a way of specifying a multi-level resource that
>> which
>> > > the
>> > > >> FlightClient is connecting to:
>> > > >>
>> > > >> eg a connection URI of the form:
>> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n
>> > > >>
>> > > >> The FlightClient could send this path as either a header or a
>> session
>> > > >> property (with a neutral name like 'resource-path'). Flight SQL
>> > > Producers
>> > > >> could interpret this as a catalog or schema.
>> > > >> eg
>> > > >> grpc://<host>:<port>/catalog/schema
>> > > >>
>> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <j...@juxt.pro>
>> wrote:
>> > > >>
>> > > >>> Sounds good to me.
>> > > >>>
>> > > >>> > Are you interested in writing up a (sketch of a) proposal?
>> > > >>>
>> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be a
>> bit
>> > > >>> intermittent.
>> > > >>>
>> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <lidav...@apache.org>
>> wrote:
>> > > >>>
>> > > >>> > Hey James H.,
>> > > >>> >
>> > > >>> > That would make sense to me. So it sounds like we'd want
>> > > >>> >
>> > > >>> > - Formal specification of using cookies/headers to mark a
>> 'session'
>> > > (I
>> > > >>> > guess this will be a little inconsistent with transactions,
>> though)
>> > > >>> > - Adding RPCs to query session values
>> > > >>> > - Adding RPCs to set session values
>> > > >>> > - Listing standard values and types
>> > > >>> >
>> > > >>> > Some things may require more consideration, e.g. transaction
>> > > isolation
>> > > >>> > might be better off as part of the transaction RPCs than an
>> ambient
>> > > >>> > property. Are you interested in writing up a (sketch of a)
>> proposal?
>> > > >>> >
>> > > >>> > -David
>> > > >>> >
>> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote:
>> > > >>> > > Similarly, we're also currently considering how best to
>> implement
>> > > >>> some of
>> > > >>> > > the SQL standard session variables in our Flight SQL server -
>> > > things
>> > > >>> like
>> > > >>> > > current transaction isolation level, access mode, time zone
>> etc,
>> > > which
>> > > >>> > seem
>> > > >>> > > to have similar properties to the (traditional) connection's
>> > > current
>> > > >>> > > catalog. We'd (perhaps naively) looked at solutions involving
>> the
>> > > >>> Flight
>> > > >>> > > client's `ClientCookieMiddleware`, but might these have
>> > > standardised
>> > > >>> > > support within Flight SQL itself eventually?
>> > > >>> > >
>> > > >>> > > Cheers,
>> > > >>> > >
>> > > >>> > > James
>> > > >>> > >
>> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <lidav...@apache.org>
>> wrote:
>> > > >>> > >
>> > > >>> > >> I think having better support for this makes sense, but
>> perhaps we
>> > > >>> can
>> > > >>> > >> find a way to make it not tied to the connection itself? For
>> > > >>> instance,
>> > > >>> > in
>> > > >>> > >> the same way transactions were implemented (as a handle). Or
>> > > rather,
>> > > >>> > >> instead of adding connection statefulness to Flight RPC, I'd
>> > > rather
>> > > >>> try
>> > > >>> > to
>> > > >>> > >> work within the gRPC/RPC paradigm.
>> > > >>> > >>
>> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote:
>> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the
>> concept of
>> > > >>> > catalogs
>> > > >>> > >> as
>> > > >>> > >> > containers of database schemas. Users can usually specify an
>> > > >>> initial
>> > > >>> > >> > catalog during the connection process, list catalogs, and
>> > > sometimes
>> > > >>> > >> change
>> > > >>> > >> > catalogs during the session.
>> > > >>> > >> >
>> > > >>> > >> > Currently catalog support in Flight SQL is somewhat
>> limited. The
>> > > >>> > protocol
>> > > >>> > >> > provides a way to list catalogs as well as metadata in
>> > > SqlTypeInfo
>> > > >>> for
>> > > >>> > >> > reporting how catalogs are supported from a syntax
>> perspective.
>> > > >>> > >> >
>> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog.
>> > > >>> > Additionally,
>> > > >>> > >> > Flight SQL doesn't really provide the concept of "initial"
>> > > >>> connection
>> > > >>> > >> > properties (such as a starting catalog) since Flight itself
>> is
>> > > >>> > stateless
>> > > >>> > >> > from a connection perspective.
>> > > >>> > >> >
>> > > >>> > >> > To support catalogs properly, I'd imagine we need to make
>> some
>> > > >>> > changes to
>> > > >>> > >> > the Flight SQL protocol:
>> > > >>> > >> > - Introduce the concept of connection-time properties
>> (perhaps
>> > > an
>> > > >>> > >> optional
>> > > >>> > >> > RPC for Flight SQL applications that need this)
>> > > >>> > >> > - Related to the above, expand the connection URL and Java
>> > > builder
>> > > >>> to
>> > > >>> > >> allow
>> > > >>> > >> > arbitrary application-specific properties.
>> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant
>> error
>> > > >>> codes
>> > > >>> > if
>> > > >>> > >> > this is not permitted.
>> > > >>> > >> >
>> > > >>> > >> >
>> > > >>> > >> > --
>> > > >>> > >> >
>> > > >>> > >> > *James Duong*
>> > > >>> > >> > Lead Software Developer
>> > > >>> > >> > Bit Quill Technologies Inc.
>> > > >>> > >> > Direct: +1.604.562.6082 | jam...@bitquilltech.com
>> > > >>> > >> > https://www.bitquilltech.com
>> > > >>> > >> >
>> > > >>> > >> > This email message is for the sole use of the intended
>> > > recipient(s)
>> > > >>> > and
>> > > >>> > >> may
>> > > >>> > >> > contain confidential and privileged information.  Any
>> > > unauthorized
>> > > >>> > >> review,
>> > > >>> > >> > use, disclosure, or distribution is prohibited.  If you are
>> not
>> > > the
>> > > >>> > >> > intended recipient, please contact the sender by reply
>> email and
>> > > >>> > destroy
>> > > >>> > >> > all copies of the original message.  Thank you.
>> > > >>> > >>
>> > > >>> > >
>> > > >>> > >
>> > > >>> > > --
>> > > >>> > > *James Henderson*
>> > > >>> > > XTDB Developer at *JUXT*
>> > > >>> > > Email j...@juxt.pro
>> > > >>> > > Website https://juxt.pro
>> > > >>> > >
>> > > >>> > > [image: photo]
>> > > >>> >
>> > > >>>
>> > > >>>
>> > > >>> --
>> > > >>> *James Henderson*
>> > > >>> XTDB Developer at *JUXT*
>> > > >>> Email j...@juxt.pro
>> > > >>> Website https://juxt.pro
>> > > >>>
>> > > >>> [image: photo]
>> > > >>>
>> > > >>
>> > > >>
>> > > >> --
>> > > >>
>> > > >> *James Duong*
>> > > >> Lead Software Developer
>> > > >> Bit Quill Technologies Inc.
>> > > >> Direct: +1.604.562.6082 | jam...@bitquilltech.com
>> > > >> https://www.bitquilltech.com
>> > > >>
>> > > >> This email message is for the sole use of the intended recipient(s)
>> and
>> > > >> may contain confidential and privileged information.  Any
>> unauthorized
>> > > >> review, use, disclosure, or distribution is prohibited.  If you are
>> not
>> > > the
>> > > >> intended recipient, please contact the sender by reply email and
>> destroy
>> > > >> all copies of the original message.  Thank you.
>> > > >>
>> > > >
>> > > >
>> > > > --
>> > > >
>> > > > *James Duong*
>> > > > Lead Software Developer
>> > > > Bit Quill Technologies Inc.
>> > > > Direct: +1.604.562.6082 | jam...@bitquilltech.com
>> > > > https://www.bitquilltech.com
>> > > >
>> > > > This email message is for the sole use of the intended recipient(s)
>> and
>> > > may
>> > > > contain confidential and privileged information.  Any unauthorized
>> > > review,
>> > > > use, disclosure, or distribution is prohibited.  If you are not the
>> > > > intended recipient, please contact the sender by reply email and
>> destroy
>> > > > all copies of the original message.  Thank you.
>> > >
>> >
>> >
>> > --
>> >
>> > *James Duong*
>> > Lead Software Developer
>> > Bit Quill Technologies Inc.
>> > Direct: +1.604.562.6082 | jam...@bitquilltech.com
>> > https://www.bitquilltech.com
>> >
>> > This email message is for the sole use of the intended recipient(s) and
>> may
>> > contain confidential and privileged information.  Any unauthorized
>> review,
>> > use, disclosure, or distribution is prohibited.  If you are not the
>> > intended recipient, please contact the sender by reply email and destroy
>> > all copies of the original message.  Thank you.
>>

Reply via email to