Databases and the D Standard Library

Adam Wilson via Digitalmars-d Sat, 31 Dec 2016 19:26:10 -0800

Hi Everyone,

I've seen a lot of talk on the forums over the past year about the need for database support in the D Standard Library and I completely agree. At the end of the day the purpose of any programming language and its attendant libraries is to allow the developer to solve their problems quickly and efficiently; and a large subset of those solutions require some form of structured data store. To my mind, this makes some form of interface(s) to a data-store an essential component of the D Standard Library. And since this is something that my particular problem spaces also need, I thought it would be useful to attempt to do something about it.

First, I've seen a couple of promising projects, the most complete and recent of which, dstddb (GitHub: https://github.com/cruisercoder/dstddb), hasn't seen a commit since June. An additional setback came when I tried to use it and was greeted with a litany of compiler errors.

This is *not* a problem, it's the natural course of a volunteer community such as ours; and I want to thank Erik Smith profusely for his work. Priorities and circumstances change and that means that valuable projects are inexplicably dropped.

But we still lack a critical component, and to get the conversation started, I'd like to break down the issues I've seen brought up in past threads on this subject and encourage you to bring your own. I may have ideas, but I can't possibly know the entire problem space.


1. Isn't this an enormous amount of work?

My answer: Absolutely, depending on your preferred scope of work.

In general, I've seen two distinct camps on this issue. One says that we should implement everything in D from the ground up, including re-implementation of the database drivers themselves in D. If this is your preferred scope of work then you will invariably become disheartened at the truly stupendous amount of work you face and give up.

The other camp says that we should make use of existing drivers and include them in the D Standard Library. This is a difficult path to follow, as the vanilla build of the D Standard Library would then require a significant number of foreign libraries, all with differing licenses, to be built and distributed to everyone, regardless of whether or not they use them in their project. This is more or less the path that dstddb is/was on.

My idea: Focus on defining the interface, not the individual driver implementations.

If instead we focused on defining an interface that a "conforming implementation" had to follow, we would allow developers to only pull in the library they need or build a from-scratch library if they so desire.

Indeed, this is the model that both Java (JDO) and .NET (ADO.NET) follow and I think we would be well advised to follow their lead here. Not only is the methodology battle-proven, it is also well understood by a significant portion of D's potential user-base. By way of example, Npgsql is the ADO.NET implementation of a driver for PostgreSQL.
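
To make the "define the interface" idea a little more concrete, here is a minimal sketch in D. Every name in it (Connection, InMemoryConnection, countRows) is hypothetical and invented for illustration; nothing like it exists in Phobos, and a real design would need much more (parameters, transactions, typed results). The only point is that application code depends on the interface while the driver lives in a separate library.

```d
import std.stdio : writeln;

// Hypothetical interface that a "conforming implementation" would follow.
// These names are assumptions made for this sketch, not an existing API.
interface Connection
{
    void open();
    void close();
    bool isOpen();
    string[][] query(string sql);
}

// Stand-in driver that would ship outside the standard library.
// A real driver would wrap a client library or speak the wire protocol.
class InMemoryConnection : Connection
{
    private bool opened;
    private string[][] rows;

    this(string[][] rows) { this.rows = rows; }

    void open() { opened = true; }
    void close() { opened = false; }
    bool isOpen() { return opened; }

    string[][] query(string sql)
    {
        if (!opened)
            throw new Exception("connection is not open");
        return rows; // this toy driver ignores the SQL text entirely
    }
}

// Application code is written against the interface only.
size_t countRows(Connection conn, string sql)
{
    conn.open();
    scope (exit) conn.close(); // close even if query throws
    return conn.query(sql).length;
}

void main()
{
    auto conn = new InMemoryConnection([["1", "alice"], ["2", "bob"]]);
    writeln(countRows(conn, "SELECT id, name FROM users")); // prints 2
}
```

Swapping in a different driver would only change the line that constructs the connection; countRows stays as it is.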

2. There are so many different types of data storage systems, how do you design a system generic enough for all of them?

My answer: You don't. Nobody else has bothered trying, and I believe that our worry over that question is a large part of why we don't have anything substantive today.


My idea: Split the data storage systems out by category of data-store.
For example:
        - SQL: std.database.sql (PostgreSQL, MySQL, MSSQL, etc.)
        - Document: std.database.document (Mongo, CouchDB, etc.)
        - Key-Value: std.database.keyvalue (Redis, etcd2, etc.)

If you want something that doesn't fit into a category above, you're on your own, but you were also on your own in other languages.
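
As a rough illustration of why splitting by category can be simpler than one generic abstraction, here is a sketch of how differently shaped the per-category interfaces might be. KeyValueStore, DocumentStore and MemoryKeyValueStore are hypothetical names I made up for this example; they are not proposals for the actual contents of std.database.keyvalue or std.database.document.

```d
import std.stdio : writeln;

// Hypothetical shape for a key-value category interface.
interface KeyValueStore
{
    void set(string key, string value);
    // Returns null when the key is absent.
    string get(string key);
}

// Hypothetical shape for a document category interface.
// It has little in common with the key-value one, which is the point.
interface DocumentStore
{
    void insert(string collection, string[string] document);
    string[string][] find(string collection, string field, string value);
}

// Toy in-memory implementation of the key-value interface.
class MemoryKeyValueStore : KeyValueStore
{
    private string[string] data;

    void set(string key, string value) { data[key] = value; }

    string get(string key)
    {
        auto p = key in data; // pointer to the value, or null if missing
        return p ? *p : null;
    }
}

void main()
{
    auto kv = new MemoryKeyValueStore;
    kv.set("greeting", "hello");
    writeln(kv.get("greeting"));
    writeln(kv.get("missing") is null);
}
```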

3. We need to provide a single interface for all data-stores in the SQL/Document/Key-Value category.

My answer: Are you sure? The problem is that each underlying data-store has its own dialect. For example, PostgreSQL and MSSQL are both ostensibly ANSI-SQL, except where they aren't. Re-targeting data-stores, even in the same category, is never going to be as simple as changing a connection string. And additionally, you will have to implement a super-set of features in the interface to support all the variations and throw exceptions where the chosen implementation does not support a specific feature.

My idea: Each data store has its own implementation with its own naming convention. For example (ADO.NET):

        - SqlConnection (MSSQL)
        - NpgsqlConnection (Npgsql)

Yes, this means that you have to change names in your code if you switch data-stores, but since you are already changing your queries, which is a much more difficult change, this isn't a significant additional cost. Also, the code becomes clearer to those who take over maintenance duties from the original author, especially when you are mixing data-stores. But in all honesty, most developers will pick one technology and stick with it for the entirety of the software's lifespan.
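
A small sketch of the dialect problem, assuming hypothetical D types that borrow the ADO.NET names above. The firstRows helper is invented for this example and is not part of any real driver; it only shows that the query text itself differs between the two stores (PostgreSQL limits rows with LIMIT, MSSQL with TOP), so renaming the connection type is the smaller part of a migration.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical per-store types; the names mirror the ADO.NET examples.
final class NpgsqlConnection
{
    // PostgreSQL dialect: a trailing LIMIT clause.
    string firstRows(string table, uint n)
    {
        return "SELECT * FROM " ~ table ~ " LIMIT " ~ n.to!string;
    }
}

final class SqlConnection
{
    // MSSQL dialect: TOP directly after SELECT.
    string firstRows(string table, uint n)
    {
        return "SELECT TOP " ~ n.to!string ~ " * FROM " ~ table;
    }
}

void main()
{
    auto pg = new NpgsqlConnection;
    auto ms = new SqlConnection;
    writeln(pg.firstRows("users", 10)); // SELECT * FROM users LIMIT 10
    writeln(ms.firstRows("users", 10)); // SELECT TOP 10 * FROM users
}
```

The string concatenation here is only for showing the two dialects side by side; it is not how a real driver should build queries.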

4. We should hide querying from the developer because they are bad at it, security flaws, etc.

My answer: While I agree in principle, especially with the security concerns, in reality what you are asking for is an ORM. In my opinion, that is a separate concern from a database interface, and is typically implemented as a layer over the DB interface.


My idea: Don't do it. Save it for a different project.

5. D has so many useful features for data access, we should use as many as possible!

My answer: D absolutely has many useful features for data access and manipulation. But that doesn't mean that a good interface has to use any of them. The first job of a Database Interface, and indeed any library, is to get the job done with a minimum of overhead. Let's worry about that before going crazy adding in all the D goodness. Ranges have been a particular target for abuse here, and while I love ranges, I think the mechanics of data-store manipulation don't lend themselves well to working with ranges. I'd love to hear your ideas on this though.

My idea: Focus on a more conservative implementation in the style of JDO or ADO.NET. This will allow us to ship something that works in a reasonable time frame. I'm not saying that we can't use any of D's unique talents, but using those talents should be subordinate to designing an interface that works efficiently.

That is all I have for now. I am looking forward to hearing your thoughts on this topic! Until then, I am going to go close out 2016 (PST) with family and friends and I wish you all a Happy New Year!


--
Adam Wilson
IRC: LightBender
import quiet.dlang.dev;
