Looking at the info, there's already some SqlInfo values that correspond to
indicating support for transaction isolation types, so we just need to add
the ability to request the isolation desired when Beginning a transaction,
I'll aim to suggest an edit to the document later today or tomorrow.

On Fri, Feb 17, 2023 at 2:02 PM David Li <lidav...@apache.org> wrote:

> I think it'd make sense. Do you want to propose a section in the doc?
>
> I think it'd make the most sense to just define enums/flags in Flight SQL
> instead of arbitrary options. (A corresponding set of SqlInfo values could
> indicate support for each of them.)
>
> On Thu, Feb 16, 2023, at 16:16, Matthew Topol wrote:
> > While implementing Transaction handling for ADBC via Flight SQL's
> > transaction primitives, another potential enhancement would be to expand
> > the BeginTransaction request to include a spot for "options" such as
> > IsolationLevel or marking a transaction as ReadOnly.
> >
> > Anyone have thoughts on this?
> >
> > On Wed, Feb 15, 2023 at 10:19 AM David Li <lidav...@apache.org> wrote:
> >
> >> The ADBC and Flight SQL proposals have been updated for
> >> Micah/Taeyun/Will's comments.
> >>
> >> On Wed, Feb 15, 2023, at 09:17, David Li wrote:
> >> > Hi Taeyun,
> >> >
> >> > Thanks for the detailed feedback!
> >> >
> >> > - I will clarify that PollFlightInfo should return as quickly as
> >> > possible on the first call, and that updates in progress value are
> also
> >> > OK (though the server shouldn't spam updates). (I wanted to avoid
> >> > streaming calls as it does not work as well with browser-based gRPC
> >> > clients.)
> >> > - I will clarify cancel_descriptor to note that it is optional.
> >> > - I wanted to avoid adding several new RPC methods, but if there is
> >> > rough agreement that these would be generally useful, I will add them
> >> > and deprecate the Flight SQL message [3]. (We could also possibly
> >> > define 'standard' DoAction Protobuf messages, but I worry about
> >> > implementation [1]. I may prototype this first, since then we could
> >> > avoid having redundant paths in Flight RPC/Flight SQL.) If we do this,
> >> > I think we do not need cancel_descriptor. (It can work like
> >> > CancelQuery.)
> >> > - I meant that CancelQuery should work with a partial FlightInfo from
> a
> >> > PollFlightInfo response. However this doesn't work if there's no
> >> > endpoints in the response! I will add app_metadata fields to
> >> > FlightInfo/FlightEndpoint. I think this can also be useful for
> >> > applications that need to add their own semantics to these messages
> >> > anyways, since Ticket is not meant to be parsed by the client. (You
> >> > could stuff the info into the schema, but that also doesn't work if
> the
> >> > schema is not yet known.)
> >> >
> >> > As for the partial DoGet: I think this is interesting and we can
> >> > discuss. Google BigQuery Storage supports this use case [2]. As you
> >> > note, if you are using this to request only a few rows, you may not
> >> > benefit much from Arrow.
> >> >
> >> > [1]: The C++ Protobuf library makes it difficult to define and share
> >> > messages across multiple shared libraries. On Windows, protoc does not
> >> > properly insert dllimport/dllexport macros (despite claiming to), and
> >> > on Unixes Protobuf interacts oddly with our linker script/symbol
> >> > hiding. This would be a lot of work, but I wonder if we could use an
> >> > implementation like upb/nanopb that does not rely on global state for
> >> > Arrow. This would also hopefully ease conflicts with projects that
> want
> >> > to use their own Protobuf definitions - as with Substrait. The main
> >> > challenge here is getting them to work with gRPC; I think we would
> have
> >> > to handroll the gRPC code that is normally generated. This may not be
> >> > too bad, just undocumented/may not be a stable API, and it would also
> >> > let us avoid the iffy casting we currently do to bypass gRPC's
> >> > serialization.
> >> > [2]:
> >> >
> >>
> https://github.com/googleapis/googleapis/blob/1870ba2163526fa9fba63bf899c92707476d4603/google/cloud/bigquery/storage/v1/storage.proto#L268-L282
> >> > [3]: It may be time to consider explicit versioning of Flight
> >> > RPC/Flight SQL?
> >> >
> >> > On Tue, Feb 14, 2023, at 20:45, Taeyun Kim wrote:
> >> >> Hi David,
> >> >>
> >> >> Thank you very much for your proposal.
> >> >> My comments about it are as follows:
> >> >>
> >> >> About PollFlightInfo:
> >> >>
> >> >> Many SQL queries (in fact, almost all OLAP queries?) cannot produce
> any
> >> >> output records until it completes - because of GROUP BY or ORDER BY
> >> clause.
> >> >> In that case, PollFlightInfo can degenerate to GetFlightInfo since
> the
> >> >> server will not respond unless there are changes to the result. If
> the
> >> >> 'progress' field of RetryInfo is also regarded as the result, the
> >> server can
> >> >> respond with a different progress value. But the server that does not
> >> know
> >> >> the progress information cannot use that.
> >> >> The client can call the RPC with a timeout to avoid arbitrarily long
> >> >> polling, but in that case, the client would not be able to get a
> >> descriptor
> >> >> for cancellation of the query if the first PollFlightInfo does not
> >> return
> >> >> soon. Maybe it should be specified that the server processing
> >> PollFlightInfo
> >> >> must return immediately after it parses the query and starts
> executing
> >> it to
> >> >> provide the cancel_descriptor as soon as possible.
> >> >> Regarding cancel_descriptor, it would be nice for the server to
> unset it
> >> >> even if the query is still in progress, to notify the client that the
> >> query
> >> >> cancellation is not supported.
> >> >> BTW, I thought of something like StreamingGetFlightInfo, which is a
> >> >> bidirectional streaming version of PollFlightInfo. But maybe
> >> PollFlightInfo
> >> >> is better since the other client that does not own the GRPC call
> stream
> >> can
> >> >> cancel the query. (Or maybe StreamingGetFlightInfo can send
> >> >> cancel_descriptor for use outside the stream.)
> >> >>
> >> >> About CloseQuery:
> >> >>
> >> >> I think that it would be great if the RPC call is in Flight RPC
> rather
> >> than
> >> >> in FlightSQL RPC since the FlightInfo that it tries to close is got
> from
> >> >> GetFlightInfo/PollFlightInfo in Flight RPC. In that case, maybe it
> >> would be
> >> >> nice to name it 'CloseFlightInfo', to be matched with GetFlightInfo.
> >> >>
> >> >> About RefreshQuery:
> >> >>
> >> >> Same as CloseQuery. Maybe it can be named 'RetainFlightInfo'.
> >> >>
> >> >> About CancelQuery:
> >> >>
> >> >> I don't know how to use it. CancenQuery requires FlightInfo from the
> >> server.
> >> >> But by the time the client receives FlightInfo, the query has been
> >> already
> >> >> completed, doesn't it?
> >> >>
> >> >> Another (unrelated?) request (not in the proposal):
> >> >>
> >> >> In DoGet, the client must consume the whole endpoint. It can make it
> >> >> difficult for a client who only wants to or can retrieve only a small
> >> >> portion of it. (For example, there may be a web client that displays
> the
> >> >> result in tabular format page-by-page. A web server can cache the
> DoGet
> >> >> result, but by doing that the web server must manage a state. A
> >> stateful web
> >> >> server is harder to implement and manage.) Can we have a variant of
> >> DoGet
> >> >> that only retrieves a portion of an endpoint? That RPC method can
> have
> >> >> record_offset and record_count arguments. (Maybe it defeats the
> purpose
> >> of
> >> >> Flight RPC which prefers fast, bulk transfer.)
> >> >>
> >> >> Thank you.
> >> >>
> >> >> -----Original Message-----
> >> >> From: David Li <lidav...@apache.org>
> >> >> Sent: Wednesday, February 15, 2023 8:06 AM
> >> >> To: dev@arrow.apache.org
> >> >> Subject: Re: [DISCUSS] Flight RPC/Flight SQL/ADBC enhancements
> >> >>
> >> >> Ah, right. I haven't written up the last set of ADBC proposals yet.
> >> I'll do
> >> >> that in the next day or two.
> >> >>
> >> >> On Tue, Feb 14, 2023, at 17:38, Will Jones wrote:
> >> >>> Hi David,
> >> >>>
> >> >>> The proposals in the Flight/Flight SQL document look excellent. As
> >> >>> I've been looking at ADBC I've been wondering about polling / async
> >> >>> execution, cancellation, and progress indicators. Glad to see those
> in
> >> >>> the Flight document, but where are they in the ADBC issues? Do they
> >> >>> still need to be created?
> >> >>>
> >> >>> Best,
> >> >>>
> >> >>> Will Jones
> >> >>>
> >> >>> On Tue, Feb 14, 2023 at 12:58 PM David Li <lidav...@apache.org>
> wrote:
> >> >>>
> >> >>>> Hello,
> >> >>>>
> >> >>>> I would like to submit some Flight RPC and Flight SQL enhancements
> >> >>>> for discussion. They cover the following:
> >> >>>>
> >> >>>> - Executing 'queries' in a retryable, nonblocking way
> >> >>>> - Handling ordered result sets
> >> >>>> - Handling expiration of/re-reading result sets
> >> >>>>
> >> >>>> In addition, there are corresponding proposals for ADBC in
> >> >>>> anticipation of these features, James's catalogs proposal for
> Flight
> >> >>>> SQL, and other feedback.
> >> >>>>
> >> >>>> The Flight proposals are described in this document [1]. It should
> be
> >> >>>> open for comments.
> >> >>>> The ADBC proposals are filed as individual issues in this milestone
> >> [2].
> >> >>>>
> >> >>>> Any feedback is much appreciated. There are not yet prototype
> >> >>>> implementations, but if there is a rough consensus then I can
> begin on
> >> >> that.
> >> >>>>
> >> >>>> [1]:
> >> >>>>
> https://docs.google.com/document/d/1jhPyPZSOo2iy0LqIJVUs9KWPyFULVFJXT
> >> >>>> ILDfkadx2g/edit?usp=sharing
> >> >>>> [2]: https://github.com/apache/arrow-adbc/milestone/3
> >> >>>>
> >> >>>> Thanks,
> >> >>>> David
> >> >>>>
> >>
>

Reply via email to