Hi James, Thanks for sharing your plan! I will wait for an update.
-- kou In <ch2pr13mb383114d500d25e30a2166d6ae3...@ch2pr13mb3831.namprd13.prod.outlook.com> "Re: DISCUSS: [FlightSQL] Catalog support" on Tue, 14 Feb 2023 04:53:07 +0000, James Duong <james.du...@improving.com.INVALID> wrote: > Hi Sutou, > > I saw your PostgreSQL project and thought it was quite interesting, > especially given the number of PostgreSQL-compatible databases. > > Paul Nienaber will be picking up implementation of the catalog feature going > forward and can provide an update > > Get Outlook for Android<https://aka.ms/AAb9ysg> > > ________________________________ > From: Sutou Kouhei <k...@clear-code.com> > Sent: Thursday, February 9, 2023, 22:25 > To: dev@arrow.apache.org <dev@arrow.apache.org> > Subject: Re: DISCUSS: [FlightSQL] Catalog support > > Hi James > > Is there any progress of this? > > I'm developing a Flight SQL adapter for PostgreSQL: > https://github.com/apache/arrow-flight-sql-postgresql > > I want to implement session feature for it because opening > a session in PostgreSQL is expensive. PostgreSQL uses one > process per session. If we open and close a session for > each Flight SQL call, we need to start one process for each > Flight SQL call. > > I noticed that the current Flight SQL specification doesn't > provide the standard session support. So I'm interesting in > this discussion. > > Background: https://github.com/apache/arrow-flight-sql-postgresql/issues/13 > > > Thanks, > -- > kou > > In > > <ch2pr13mb3831b71269c257469631fb5de3...@ch2pr13mb3831.namprd13.prod.outlook.com> > "Re: DISCUSS: [FlightSQL] Catalog support" on Mon, 12 Dec 2022 18:12:06 > +0000, > James Duong <james.du...@improving.com.INVALID> wrote: > >> Hi David, >> >> I've written up the URI parsing in C++ and started adding session management >> messages. I'm also planning on having the ClientCookieMiddlewareFactory be >> able to report if sessions are enabled on the server. >> >> I (or another developer) will send an update once those features are ready >> for demo. >> ________________________________ >> From: David Li <lidav...@apache.org> >> Sent: December 12, 2022 10:07 AM >> To: dev@arrow.apache.org <dev@arrow.apache.org> >> Subject: Re: DISCUSS: [FlightSQL] Catalog support >> >> Following up here, James are you interested in putting up a draft PR for the >> Flight SQL URI format and for session management? >> >> The Flight SQL URI format would then also cover Andrew's use case. And if >> someone wants to draw up a PR to the JDBC driver to enable arbitrary >> properties, I can review that too. >> >> On Sat, Dec 3, 2022, at 05:38, Andrew Lamb wrote: >>>> Andrew, do we need to look into adding more metadata to indicate >>> different query languages? (It's quite a shame that we named this Flight >>> SQL at this point...) >>> >>> TDLR is I don't think trying to explicitly support languages other than SQL >>> in FlightSQL is a good idea. Among other reasons, the JDBC / ODBC drivers, >>> which mostly assume SQL, are one of the key features of FlightSQL, and they >>> are likely not as useful for non SQL. I can see the argument to support for >>> substrait plans, and it will be interesting to see what use cases benefit >>> from that support. >>> >>> What would make our life easier would be some standard way to pass >>> application specific key/value pairs from the JDBC driver to a flight SQL >>> backend with each request (perhaps via gRPC headers). This would allow >>> passing configuration parameters that were not envisioned in the spec, from >>> end user (of the JDBC driver) all the way to our backend. >>> >>> Thanks again for driving this forward, >>> Andrew >>> >>> On Thu, Dec 1, 2022 at 7:11 PM David Li <lidav...@apache.org> wrote: >>> >>>> Hey James, thanks for putting this up. >>>> >>>> Inline: >>>> >>>> > The suggestion is to make this part of Flight as an >>>> > optional feature, rather than Flight SQL due to its applicability outside >>>> > of just database access. >>>> >>>> Which uses do you see? I see statefulness as a general antipattern here, >>>> so I'm wary of introducing it beyond where we need it. >>>> >>>> > - The Flight client supplies a New-Session header which has key-value >>>> pairs >>>> > for initial session options. This header can be applied to any RPC call, >>>> > but logically should be the first one the client makes. >>>> >>>> Handshake already effectively serves as this RPC - maybe we could extend >>>> it? (I also see Handshake as an antipattern because it's a stateful auth >>>> mechanism.) >>>> >>>> Should the session timeout/be on a lease? (gRPC doesn't really give the >>>> server a way to track the persistence of a particular client connection.) >>>> >>>> > It's a bit asymmetric that creating a new session is done by applying a >>>> > header, but closing a session is an RPC call. This was so that session >>>> > creation doesn't introduce another round trip before the first real data >>>> > request. If there's a way to batch RPC calls it might be better to make >>>> > session creation an RPC call. >>>> >>>> Is this a worrisome amount of overhead? >>>> >>>> Unfortunately gRPC doesn't batch RPC calls, but RPC calls on the same >>>> client generally share the same TCP connection (modulo load balancing >>>> behavior, but presumably that is not enabled if you want persistent >>>> sessions). >>>> >>>> On the implementation side, I'd like to avoid baking this in too deeply if >>>> at all possible. Ideally it'd be implemented entirely as middleware, >>>> possibly making use of an interface so applications can override the >>>> session storage (hashtable, Redis, etcd, etc.) >>>> >>>> > Just to chime in on this, one thing I'm curious about is whether there >>>> > will be support for user-defined catalog/schema hierarchy depth? >>>> >>>> Gavin, for ADBC we discussed adding a delimiter to the catalog name to >>>> handle this case - maybe we can handle this by adding a property for the >>>> delimiter to SqlInfo? >>>> >>>> > https://github.com/influxdata/influxdb_iox/issues/6102 >>>> >>>> Andrew, do we need to look into adding more metadata to indicate different >>>> query languages? (It's quite a shame that we named this Flight SQL at this >>>> point...) >>>> >>>> On Thu, Dec 1, 2022, at 08:49, Andrew Lamb wrote: >>>> > Sorry for the late reply -- thank you James and David for this >>>> discussion. >>>> > >>>> > I agree that adding Catalog support would be a valuable addition to >>>> Flight >>>> > SQL, and it recently came up as we begin to implement Flight SQL in >>>> > InfluxDB IOx [1]. >>>> > >>>> >> - A standard URI scheme for Flight SQL that can be used by multiple >>>> > client APIs (JDBC, ADBC, etc.) >>>> > >>>> > I agree this would be very valuable, along with a standard way (ideally >>>> > with HTTP headers) to send this information as part of the FlightSQL gRPC >>>> > requests. >>>> > >>>> >> I'd suggest we define session management features explicitly in Flight >>>> > (while being optional). >>>> > >>>> > I agree it is critical that server-side state is not required to >>>> implement >>>> > FlightSQL. Stateful connections would likely complicate deploying >>>> FlightSQL >>>> > in distributed systems. I suggest it should be possible to implement any >>>> > session management features by sending the entire session state with the >>>> > request, if desired. >>>> > >>>> > I don't have a strong opinion about the merits of including explicit >>>> > session management features in FlightSQL. It seems to me that keeping the >>>> > API surface of FlightSQL minimal and implementation flexibility maximal >>>> > should be the default. However, if JDBC/ODBC driver compatibility would >>>> be >>>> > improved with explicit state management APIs, then adding them to >>>> FlightSQL >>>> > seems like a good idea to me. >>>> > >>>> > Thanks again -- it is amazing to hit some issue in design and then find >>>> out >>>> > the Arrow community is already hard at work on a solution. >>>> > >>>> > Andrew >>>> > >>>> > [1] https://github.com/influxdata/influxdb_iox/issues/6102 >>>> > >>>> > On Wed, Nov 30, 2022 at 7:17 PM Gavin Ray <ray.gavi...@gmail.com> wrote: >>>> > >>>> >> Just to chime in on this, one thing I'm curious about is whether there >>>> >> will be support for user-defined catalog/schema hierarchy depth? >>>> >> >>>> >> This comment that James made does seem reasonable to me >>>> >> > scheme://<host>:<port>/path-1/path-2/.../path-n >>>> >> >>>> >> Trino/Presto does a similar thing (jdbc:trino://localhost:8080/tpch/sf1) >>>> >> >>>> >> At Hasura, what we do is have an alias "FullyQualifiedName" which is >>>> >> just "Array<String>" >>>> >> and the identifier to some element in a data source is always >>>> >> fully-qualified: >>>> >> >>>> >> https://github.com/hasura/graphql-engine/tree/master/dc-agents#schema >>>> >> >>>> >> ["postgres_1", "db1", "schema2", "my_table", "col_a"] >>>> >> ["mongo", "db1", "collection_a", "field_a"] >>>> >> ["csv_adapter", "myfile.csv", "col_x"] >>>> >> >>>> >> On Wed, Nov 30, 2022 at 6:31 PM James Duong >>>> >> <jam...@bitquilltech.com.invalid> wrote: >>>> >> > >>>> >> > Our current convention of sending connection properties as headers >>>> with >>>> >> > every request has the benefit of making statefulness optional, but has >>>> >> the >>>> >> > drawback of sending redundant, unused properties on requests after the >>>> >> > first, which increases the payload size unnecessarily. >>>> >> > >>>> >> > I'd suggest we define session management features explicitly in Flight >>>> >> > (while being optional). The suggestion is to make this part of Flight >>>> as >>>> >> an >>>> >> > optional feature, rather than Flight SQL due to its applicability >>>> outside >>>> >> > of just database access. >>>> >> > >>>> >> > Creating a session: >>>> >> > - The Flight client supplies a New-Session header which has key-value >>>> >> pairs >>>> >> > for initial session options. This header can be applied to any RPC >>>> call, >>>> >> > but logically should be the first one the client makes. >>>> >> > - The server should send a Set-Cookie header back containing some >>>> >> > server-side representation of the session that the client can use in >>>> >> > subsequent requests. >>>> >> > - The path specified in the URI is sent as a "Catalog" session option. >>>> >> > >>>> >> > Modifying session options: >>>> >> > - A separate RPC call that takes in a Stream<string, string> >>>> representing >>>> >> > each session option that is being modified and returns a stream of >>>> >> statuses >>>> >> > to indicate if the setting change was accepted. >>>> >> > - This RPC call is only valid when the Cookie header is used. >>>> >> > - It is up to the server to define if a failed session property >>>> change is >>>> >> > fatal or if other properties can continue to be set. >>>> >> > >>>> >> > Closing a session: >>>> >> > - A separate RPC call that tells the server to drop the session >>>> specified >>>> >> > by the Cookie header. >>>> >> > >>>> >> > Notes: >>>> >> > A Flight SQL client would check if session management RPCs are >>>> supported >>>> >> > through a new GetSqlInfo property. A Flight client doesn't have a way >>>> to >>>> >> do >>>> >> > this generically, but there could be an application-specific RPC or >>>> >> header >>>> >> > that reports this metadata. >>>> >> > >>>> >> > The O/JDBC and ADBC drivers would need to be updated to >>>> programmatically >>>> >> > check for session management RPCs. If unsupported, then use the old >>>> >> > behavior of sending all properties as headers with each request. If >>>> >> > supported, make use of the New-Session header and drop the session >>>> when >>>> >> > closing the client-side connection. >>>> >> > >>>> >> > It's a bit asymmetric that creating a new session is done by applying >>>> a >>>> >> > header, but closing a session is an RPC call. This was so that session >>>> >> > creation doesn't introduce another round trip before the first real >>>> data >>>> >> > request. If there's a way to batch RPC calls it might be better to >>>> make >>>> >> > session creation an RPC call. >>>> >> > >>>> >> > On Tue, Nov 22, 2022 at 3:16 PM David Li <lidav...@apache.org> wrote: >>>> >> > >>>> >> > > It sounds reasonable - then there are three points: >>>> >> > > >>>> >> > > - A standard URI scheme for Flight SQL that can be used by multiple >>>> >> client >>>> >> > > APIs (JDBC, ADBC, etc.) >>>> >> > > - A standard scheme for session data (likely header/cookie-based) >>>> >> > > - A mapping from URI parameters and fields to session data >>>> >> > > >>>> >> > > >>>> >> > > >>>> >> > > On Tue, Nov 22, 2022, at 17:45, James Duong wrote: >>>> >> > > > Just following up on this and if there are any thoughts. >>>> >> > > > >>>> >> > > > The purpose would be to standardize how we specify access to some >>>> >> named >>>> >> > > > logical grouping of data. This would make it easy to model >>>> >> catalog/schema >>>> >> > > > semantics in Flight SQL. >>>> >> > > > >>>> >> > > > Having this be part of the connection URI makes it similar to >>>> >> specifying >>>> >> > > a >>>> >> > > > resource in an HTTP URL (ie an endpoint) which should make it easy >>>> >> for >>>> >> > > end >>>> >> > > > users to work with and modify. >>>> >> > > > >>>> >> > > > On Fri, Nov 18, 2022 at 3:17 PM James Duong < >>>> jam...@bitquilltech.com >>>> >> > >>>> >> > > wrote: >>>> >> > > > >>>> >> > > >> As for surfacing catalogs itself, perhaps we allow the URI take >>>> in a >>>> >> > > path >>>> >> > > >> and treat that as a way of specifying a multi-level resource that >>>> >> which >>>> >> > > the >>>> >> > > >> FlightClient is connecting to: >>>> >> > > >> >>>> >> > > >> eg a connection URI of the form: >>>> >> > > >> scheme://<host>:<port>/path-1/path-2/.../path-n >>>> >> > > >> >>>> >> > > >> The FlightClient could send this path as either a header or a >>>> >> session >>>> >> > > >> property (with a neutral name like 'resource-path'). Flight SQL >>>> >> > > Producers >>>> >> > > >> could interpret this as a catalog or schema. >>>> >> > > >> eg >>>> >> > > >> grpc://<host>:<port>/catalog/schema >>>> >> > > >> >>>> >> > > >> On Fri, Nov 11, 2022 at 2:07 AM James Henderson <j...@juxt.pro> >>>> >> wrote: >>>> >> > > >> >>>> >> > > >>> Sounds good to me. >>>> >> > > >>> >>>> >> > > >>> > Are you interested in writing up a (sketch of a) proposal? >>>> >> > > >>> >>>> >> > > >>> Yep, can do - I'm OoO over the next couple of weeks so might be >>>> a >>>> >> bit >>>> >> > > >>> intermittent. >>>> >> > > >>> >>>> >> > > >>> On Thu, 10 Nov 2022 at 15:28, David Li <lidav...@apache.org> >>>> >> wrote: >>>> >> > > >>> >>>> >> > > >>> > Hey James H., >>>> >> > > >>> > >>>> >> > > >>> > That would make sense to me. So it sounds like we'd want >>>> >> > > >>> > >>>> >> > > >>> > - Formal specification of using cookies/headers to mark a >>>> >> 'session' >>>> >> > > (I >>>> >> > > >>> > guess this will be a little inconsistent with transactions, >>>> >> though) >>>> >> > > >>> > - Adding RPCs to query session values >>>> >> > > >>> > - Adding RPCs to set session values >>>> >> > > >>> > - Listing standard values and types >>>> >> > > >>> > >>>> >> > > >>> > Some things may require more consideration, e.g. transaction >>>> >> > > isolation >>>> >> > > >>> > might be better off as part of the transaction RPCs than an >>>> >> ambient >>>> >> > > >>> > property. Are you interested in writing up a (sketch of a) >>>> >> proposal? >>>> >> > > >>> > >>>> >> > > >>> > -David >>>> >> > > >>> > >>>> >> > > >>> > On Thu, Nov 10, 2022, at 10:09, James Henderson wrote: >>>> >> > > >>> > > Similarly, we're also currently considering how best to >>>> >> implement >>>> >> > > >>> some of >>>> >> > > >>> > > the SQL standard session variables in our Flight SQL server >>>> - >>>> >> > > things >>>> >> > > >>> like >>>> >> > > >>> > > current transaction isolation level, access mode, time zone >>>> >> etc, >>>> >> > > which >>>> >> > > >>> > seem >>>> >> > > >>> > > to have similar properties to the (traditional) connection's >>>> >> > > current >>>> >> > > >>> > > catalog. We'd (perhaps naively) looked at solutions >>>> involving >>>> >> the >>>> >> > > >>> Flight >>>> >> > > >>> > > client's `ClientCookieMiddleware`, but might these have >>>> >> > > standardised >>>> >> > > >>> > > support within Flight SQL itself eventually? >>>> >> > > >>> > > >>>> >> > > >>> > > Cheers, >>>> >> > > >>> > > >>>> >> > > >>> > > James >>>> >> > > >>> > > >>>> >> > > >>> > > On Wed, 9 Nov 2022 at 21:51, David Li <lidav...@apache.org> >>>> >> wrote: >>>> >> > > >>> > > >>>> >> > > >>> > >> I think having better support for this makes sense, but >>>> >> perhaps we >>>> >> > > >>> can >>>> >> > > >>> > >> find a way to make it not tied to the connection itself? >>>> For >>>> >> > > >>> instance, >>>> >> > > >>> > in >>>> >> > > >>> > >> the same way transactions were implemented (as a handle). >>>> Or >>>> >> > > rather, >>>> >> > > >>> > >> instead of adding connection statefulness to Flight RPC, >>>> I'd >>>> >> > > rather >>>> >> > > >>> try >>>> >> > > >>> > to >>>> >> > > >>> > >> work within the gRPC/RPC paradigm. >>>> >> > > >>> > >> >>>> >> > > >>> > >> On Wed, Nov 9, 2022, at 16:47, James Duong wrote: >>>> >> > > >>> > >> > Databases such as SQL Server and PostgreSQL have the >>>> >> concept of >>>> >> > > >>> > catalogs >>>> >> > > >>> > >> as >>>> >> > > >>> > >> > containers of database schemas. Users can usually >>>> specify an >>>> >> > > >>> initial >>>> >> > > >>> > >> > catalog during the connection process, list catalogs, and >>>> >> > > sometimes >>>> >> > > >>> > >> change >>>> >> > > >>> > >> > catalogs during the session. >>>> >> > > >>> > >> > >>>> >> > > >>> > >> > Currently catalog support in Flight SQL is somewhat >>>> >> limited. The >>>> >> > > >>> > protocol >>>> >> > > >>> > >> > provides a way to list catalogs as well as metadata in >>>> >> > > SqlTypeInfo >>>> >> > > >>> for >>>> >> > > >>> > >> > reporting how catalogs are supported from a syntax >>>> >> perspective. >>>> >> > > >>> > >> > >>>> >> > > >>> > >> > ODBC and JDBC provide API calls for changing the catalog. >>>> >> > > >>> > Additionally, >>>> >> > > >>> > >> > Flight SQL doesn't really provide the concept of >>>> "initial" >>>> >> > > >>> connection >>>> >> > > >>> > >> > properties (such as a starting catalog) since Flight >>>> itself >>>> >> is >>>> >> > > >>> > stateless >>>> >> > > >>> > >> > from a connection perspective. >>>> >> > > >>> > >> > >>>> >> > > >>> > >> > To support catalogs properly, I'd imagine we need to make >>>> >> some >>>> >> > > >>> > changes to >>>> >> > > >>> > >> > the Flight SQL protocol: >>>> >> > > >>> > >> > - Introduce the concept of connection-time properties >>>> >> (perhaps >>>> >> > > an >>>> >> > > >>> > >> optional >>>> >> > > >>> > >> > RPC for Flight SQL applications that need this) >>>> >> > > >>> > >> > - Related to the above, expand the connection URL and >>>> Java >>>> >> > > builder >>>> >> > > >>> to >>>> >> > > >>> > >> allow >>>> >> > > >>> > >> > arbitrary application-specific properties. >>>> >> > > >>> > >> > - Add optional RPCs for changing the catalog and relevant >>>> >> error >>>> >> > > >>> codes >>>> >> > > >>> > if >>>> >> > > >>> > >> > this is not permitted. >>>> >> > > >>> > >> > >>>> >> > > >>> > >> > >>>> >> > > >>> > >> > -- >>>> >> > > >>> > >> > >>>> >> > > >>> > >> > *James Duong* >>>> >> > > >>> > >> > Lead Software Developer >>>> >> > > >>> > >> > Bit Quill Technologies Inc. >>>> >> > > >>> > >> > Direct: +1.604.562.6082 | jam...@bitquilltech.com >>>> >> > > >>> > >> > https://www.bitquilltech.com >>>> >> > > >>> > >> > >>>> >> > > >>> > >> > This email message is for the sole use of the intended >>>> >> > > recipient(s) >>>> >> > > >>> > and >>>> >> > > >>> > >> may >>>> >> > > >>> > >> > contain confidential and privileged information. Any >>>> >> > > unauthorized >>>> >> > > >>> > >> review, >>>> >> > > >>> > >> > use, disclosure, or distribution is prohibited. If you >>>> are >>>> >> not >>>> >> > > the >>>> >> > > >>> > >> > intended recipient, please contact the sender by reply >>>> >> email and >>>> >> > > >>> > destroy >>>> >> > > >>> > >> > all copies of the original message. Thank you. >>>> >> > > >>> > >> >>>> >> > > >>> > > >>>> >> > > >>> > > >>>> >> > > >>> > > -- >>>> >> > > >>> > > *James Henderson* >>>> >> > > >>> > > XTDB Developer at *JUXT* >>>> >> > > >>> > > Email j...@juxt.pro >>>> >> > > >>> > > Website https://juxt.pro >>>> >> > > >>> > > >>>> >> > > >>> > > [image: photo] >>>> >> > > >>> > >>>> >> > > >>> >>>> >> > > >>> >>>> >> > > >>> -- >>>> >> > > >>> *James Henderson* >>>> >> > > >>> XTDB Developer at *JUXT* >>>> >> > > >>> Email j...@juxt.pro >>>> >> > > >>> Website https://juxt.pro >>>> >> > > >>> >>>> >> > > >>> [image: photo] >>>> >> > > >>> >>>> >> > > >> >>>> >> > > >> >>>> >> > > >> -- >>>> >> > > >> >>>> >> > > >> *James Duong* >>>> >> > > >> Lead Software Developer >>>> >> > > >> Bit Quill Technologies Inc. >>>> >> > > >> Direct: +1.604.562.6082 | jam...@bitquilltech.com >>>> >> > > >> https://www.bitquilltech.com >>>> >> > > >> >>>> >> > > >> This email message is for the sole use of the intended >>>> recipient(s) >>>> >> and >>>> >> > > >> may contain confidential and privileged information. Any >>>> >> unauthorized >>>> >> > > >> review, use, disclosure, or distribution is prohibited. If you >>>> are >>>> >> not >>>> >> > > the >>>> >> > > >> intended recipient, please contact the sender by reply email and >>>> >> destroy >>>> >> > > >> all copies of the original message. Thank you. >>>> >> > > >> >>>> >> > > > >>>> >> > > > >>>> >> > > > -- >>>> >> > > > >>>> >> > > > *James Duong* >>>> >> > > > Lead Software Developer >>>> >> > > > Bit Quill Technologies Inc. >>>> >> > > > Direct: +1.604.562.6082 | jam...@bitquilltech.com >>>> >> > > > https://www.bitquilltech.com >>>> >> > > > >>>> >> > > > This email message is for the sole use of the intended >>>> recipient(s) >>>> >> and >>>> >> > > may >>>> >> > > > contain confidential and privileged information. Any unauthorized >>>> >> > > review, >>>> >> > > > use, disclosure, or distribution is prohibited. If you are not >>>> the >>>> >> > > > intended recipient, please contact the sender by reply email and >>>> >> destroy >>>> >> > > > all copies of the original message. Thank you. >>>> >> > > >>>> >> > >>>> >> > >>>> >> > -- >>>> >> > >>>> >> > *James Duong* >>>> >> > Lead Software Developer >>>> >> > Bit Quill Technologies Inc. >>>> >> > Direct: +1.604.562.6082 | jam...@bitquilltech.com >>>> >> > https://www.bitquilltech.com >>>> >> > >>>> >> > This email message is for the sole use of the intended recipient(s) >>>> and >>>> >> may >>>> >> > contain confidential and privileged information. Any unauthorized >>>> >> review, >>>> >> > use, disclosure, or distribution is prohibited. If you are not the >>>> >> > intended recipient, please contact the sender by reply email and >>>> destroy >>>> >> > all copies of the original message. Thank you. >>>> >> >>>> >