Hi everyone, I'm a PhD student at Carnegie Mellon working on database research, co-advised by Andy Pavlo and Jignesh Patel. I've worked on DuckDB in the past through my research (https://github.com/duckdb/duckdb/pull/7528), and I'm currently doing an internship at Columnar.
My main internship project has been developing a DuckDB extension for ADBC. The extension lets DuckDB users connect to Snowflake, Databricks, BigQuery, PostgreSQL, MySQL, and any other system with an ADBC driver. It supports querying ADBC databases directly through a `read_adbc` table function. It also supports using `ATTACH` to connect to an ADBC database and then running `SELECT`, `INSERT`, `COPY`, and CTAS statements as if the database were local to DuckDB. The repo is now public here: https://github.com/columnar-tech/duckdb-adbc-client/
Those following recent work at the intersection of DuckDB and ADBC may have seen that community member Rusty Conover previously published an AI-developed `adbc_scanner` community extension for DuckDB. We took a different approach with this extension, choosing to hand-code the core pieces as part of the academic goals of my internship. The extension also integrates with ADBC connection profiles, aims for broad database compatibility, and includes automatic connection pooling, automatic metadata caching, and memory-efficient `INSERT` and CTAS support through streaming bulk ingest operations.
Since we made the repo public, we have also seen some of these ideas and capabilities begin to appear in related community work. That is allowed under the Apache 2.0 license, and we want to be clear that we welcome experimentation and reuse. At the same time, the speed with which AI-assisted development can absorb and repackage work makes attribution, coordination, and shared governance especially important.
Columnar is also a financial supporter of DuckLabs, and we care about keeping the DuckDB/ADBC ecosystem collaborative and healthy. With that in mind, although we initially developed this extension under the Columnar GitHub organization for convenience, we are interested in donating it to the Arrow project and moving it to an ASF repo.
There is some work to do to assess the feasibility and details, including how DuckDB release cycles would interact with ADBC release cycles. But we think it would be valuable to have an official ADBC client extension for DuckDB maintained under ASF governance, much like the official ADBC client libraries for various languages. Our hope is that this could provide a neutral place for third-party contributors, including Rusty and others in the DuckDB and ADBC communities, to collaborate rather than maintaining separate DuckDB/ADBC extensions with overlapping goals.
I’d appreciate feedback from the Arrow community on whether this seems like a good direction and what the right next steps would be. In the meantime, please take a look at the repo, try the extension using the instructions in the README, and open issues for any bugs, compatibility problems, or design feedback: https://github.com/columnar-tech/duckdb-adbc-client/
- Sam
