I could also see extensions to ODBC/JDBC being a point of confusion for app developers.
For example, if we were to add hooks in the JDBC driver to report endpoints so
that applications can call getStream() directly, what would happen if the user
started getting a stream then went back and tried to use the regular ResultSet
interface? A stream would be consumed, but the driver wouldn't know it.

On Tue, Mar 15, 2022 at 9:07 AM Kyle Porter <ky...@bitquilltech.com.invalid>
wrote:

> In general, I have problems with attempting to expose other extensions
> through existing standards such as ODBC/JDBC. What it feels like we're
> saying is: use the standard so you don't have to change any code, except
> for this part where you must write custom code to take advantage of the
> non-standard portions.
>
> At that point, why not just write something fully custom and take advantage
> of the underlying interface?
>
> The higher level clients are meant to ease adoption and may be all that
> existing applications use, but new applications can have a choice to use
> the higher level clients or the lower level interface.
>
> *Kyle Porter*
> CEO
> Bit Quill Technologies Inc.
> Office: +1.778.331.3355 | Direct: +1.604.441.7318 | ky...@bitquilltech.com
> https://www.bitquill.com
>
> This email message is for the sole use of the intended recipient(s) and may
> contain confidential and privileged information. Any unauthorized review,
> use, disclosure, or distribution is prohibited. If you are not the
> intended recipient, please contact the sender by reply email and destroy
> all copies of the original message. Thank you.
>
> On Tue, Mar 15, 2022 at 7:55 AM David Li <lidav...@apache.org> wrote:
>
> > Aren't we getting a few things mixed up here?
> >
> > 1) As Micah says, the original proposal is about adapting Java types to
> > Arrow. This can be used independently of Flight SQL. I don't think this was
> > being pitched as a standard itself unless I'm mistaken?
> >
> > 2) Flight SQL the protocol, which _is_ a language agnostic standard,
> > though maybe not the one applications will generally choose to consume.
> >
> > 3) Idiomatic/standard per-language APIs that build on Flight SQL, which
> > will include JDBC/ODBC (there is a reference JDBC driver in the works [1]),
> > but I agree there's room for something that uses Arrow types, supports
> > partitioning, etc. as well. (And I agree there's room for something that
> > supports these features but is _not_ Flight SQL underneath.)
> >
> > ---
> >
> > I'm not super experienced with JDBC/ODBC - would extending them basically
> > mean something like (in JDBC) providing interfaces that Connections,
> > ResultSets, etc. could be cast to to access the "Arrow-native" bits? And in
> > ODBC, using something like the SQL_C_BINARY type to 'tunnel' Arrow data
> > through ODBC buffers, and/or providing a set of C API functions that could
> > convert between (say) an ODBC statement handle and an Arrow C Data
> > Interface ArrowArrayStream?
> >
> > [1]: https://github.com/apache/arrow/pull/12254
> >
> > -David
> >
> > On Tue, Mar 15, 2022, at 01:06, Micah Kornfield wrote:
> > > Hi Julian,
> > >
> > >> I like Gavin’s idea of a data-frame API. But Gavin, if you want to make it
> > >> successful, build it on top of the leading API in each language (which in
> > >> Java would be FlightSQL’s JDBC driver). I don’t see a good reason to expose
> > >> through your API the fact that FlightSQL is underneath.
> > >
> > > My understanding is that this thread is all about implementing a Flight
> > > server and making those ergonomics easier. On the client side, I think the
> > > power of Flight/FlightSQL is two fold:
> > > 1. Reference ODBC/JDBC drivers that can consume the wire format (and I
> > > think many clients will go this route). I think these are in the process
> > > of being contributed already.
> > > Which as you noted there is power in standards, so I expect this avenue
> > > to see heavy use.
> > > 2. For clients that can handle it and want to go through the trouble,
> > > consuming the data directly as Arrow for efficiency purposes. I don't
> > > think we've discussed canonical APIs by extending ODBC/JDBC but I like that
> > > idea. That seems like a discussion for after we have working JDBC/ODBC
> > > reference implementation though?
> > >
> > > I might have missed it but I don't think either approach on the client side
> > > has been discussed on this thread. I also think this is why Dataframe
> > > might not be the best name for the adapter because it comes with all sorts
> > > of assumptions about usage both on a client and a server.
> > >
> > > Cheers,
> > > Micah
> > >
> > > On Mon, Mar 14, 2022 at 9:38 PM Julian Hyde <jhyde.apa...@gmail.com> wrote:
> > >
> > >> When I read “language-agnostic standard for data access” I cringed a
> > >> little. (See [1].)
> > >>
> > >> Sure, it’s fun to create a new standard. But if your standard is
> > >> successful, there will need to be a huge amount of work changing existing
> > >> code to use your standard. That effort might even be difference between
> > >> success and failure for a small project, and therefore you have helped
> > >> protect the incumbents.
> > >>
> > >> My solution?
> > >>
> > >> I would like the FlightSQL authors to make clear that it is a wire
> > >> protocol, and only a protocol.
> > >>
> > >> Rather than creating new APIs, I would like people to spend their effort
> > >> implementing existing APIs (such as ODBC and JDBC) on top of FlightSQL.
> > >>
> > >> If those APIs are inadequate (e.g. they don’t provide access to the raw
> > >> Arrow data, or don’t support INSERT or SELECT that are partitioned across
> > >> several clients/servers), then add extensions to those APIs. But still
> > >> implement the core APIs.
> > >> When I describe a table from Java, I want to a
> > >> result set that exactly matches JDBC’s getTables [2].
> > >>
> > >> I like Gavin’s idea of a data-frame API. But Gavin, if you want to make it
> > >> successful, build it on top of the leading API in each language (which in
> > >> Java would be FlightSQL’s JDBC driver). I don’t see a good reason to expose
> > >> through your API the fact that FlightSQL is underneath.
> > >>
> > >> Julian
> > >>
> > >> [1] https://xkcd.com/927/
> > >>
> > >> [2] https://docs.oracle.com/javase/8/docs/api/java/sql/DatabaseMetaData.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-
> > >>
> > >> > On Mar 12, 2022, at 12:14 PM, Gavin Ray <ray.gavi...@gmail.com> wrote:
> > >> >
> > >> > While trying to implement and introduce the idea of adopting FlightSQL, the
> > >> > largest challenge was the API itself
> > >> >
> > >> > I know it's meant to be low-level. But I found that most of the development
> > >> > time was in code to convert to/from
> > >> > row-based data (IE Map<String, Object>) and Java types, and columnar data +
> > >> > Arrow types.
> > >> >
> > >> > I'm likely in the minority position here -- I know that Arrow and FlightSQL
> > >> > users are largely looking at transferring large volumes of data and
> > >> > servicing OLAP-type workloads
> > >> > But the thing that excites me most about FlightSQL, isn't its performance
> > >> > (always nice to have), but that it's a language-agnostic standard for data
> > >> > access.
> > >> >
> > >> > That has broad implications -- for all kinds of data-access workloads and
> > >> > business usecases.
> > >> >
> > >> > The challenge is that in trying to advocate for it, when presenting a
> > >> > proof-of-concept,
> > >> > rather than what a developer might expect to see, something like:
> > >> >
> > >> > // FlightSQL handler code
> > >> > List<Map<String, Object>> results = ....;
> > >> > results.add(Map.of("id", 1, "name", "Person 1"));
> > >> > return results;
> > >> >
> > >> > A significant portion of the code is in Arrow-specific implementation
> > >> > details:
> > >> > creating a VectorSchemaRoot, FieldVector, de-serializing the results on the
> > >> > client, etc.
> > >> >
> > >> > Just curious whether there is any interest/intention of possibly making a
> > >> > higher level API around the basic FlightSQL one?
> > >> > Maybe something closer to the traditional notion of a row-based "DataFrame"
> > >> > or "Table", like:
> > >> >
> > >> > DataFrame df = new DataFrame();
> > >> > df.addColumn("id", ArrowTypes.Int);
> > >> > df.addColumn("name", ArrowTypes.VarChar);
> > >> > df.addRow(Map.of("id", 1, "name", "Person 1"));
> > >> > VectorSchemaRoot root = df.toVectorSchemaRoot();
> > >> > listener.setVectorSchemaRoot(root);
> > >> > listener.sendVectorSchemaRootContents();

--
*James Duong*
Lead Software Developer
Bit Quill Technologies Inc.
Direct: +1.604.562.6082 | jam...@bitquilltech.com
https://www.bitquilltech.com
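
P.S. To make the getStream() concern at the top of this message concrete, here is a minimal sketch in plain Java. It is not real driver code and does not use the JDBC or Flight SQL APIs: ExclusiveResult, Mode, and claim() are hypothetical names, and getStream() is only a stand-in for the kind of hook being discussed. The sketch assumes that the driver itself hands out the stream, so it can record which access path the application picked and reject the other one. If the application fetched the stream from an endpoint behind the driver's back, the driver would have no such record, which is exactly the problem.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch: a result readable row by row or as a stream, but not both. */
final class ExclusiveResult {
    /** Which access path the application has committed to. */
    enum Mode { UNUSED, ROWS, STREAM }

    private final Iterator<List<Object>> rows;
    private Mode mode = Mode.UNUSED;
    private List<Object> current;

    ExclusiveResult(List<List<Object>> data) {
        this.rows = data.iterator();
    }

    /** ResultSet-style cursor advance; rejected once the stream has been handed out. */
    boolean next() {
        claim(Mode.ROWS);
        if (!rows.hasNext()) {
            current = null;
            return false;
        }
        current = rows.next();
        return true;
    }

    /** ResultSet-style column read, using a 1-based index as JDBC does. */
    Object getObject(int columnIndex) {
        claim(Mode.ROWS);
        if (current == null) {
            throw new IllegalStateException("cursor is not positioned on a row");
        }
        return current.get(columnIndex - 1);
    }

    /** Stand-in for the hypothetical getStream() hook; rejected after row access began. */
    Iterator<List<Object>> getStream() {
        claim(Mode.STREAM);
        return rows;
    }

    /** Records the first access path used and refuses any later switch to the other one. */
    private void claim(Mode requested) {
        if (mode == Mode.UNUSED) {
            mode = requested;
        } else if (mode != requested) {
            throw new IllegalStateException(
                "result already consumed via " + mode + "; cannot switch to " + requested);
        }
    }

    public static void main(String[] args) {
        ExclusiveResult result = new ExclusiveResult(List.of(
            List.<Object>of(1, "Person 1"),
            List.<Object>of(2, "Person 2")));

        // The application starts reading the stream directly...
        Iterator<List<Object>> stream = result.getStream();
        System.out.println("streamed: " + stream.next());

        // ...then goes back to the row-oriented interface, which now fails loudly.
        try {
            result.next();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing loudly here is a design choice for the sketch only; a real extension could equally decide to resume row access from wherever the stream left off, but either way the driver has to own the stream to know its position.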