Aaron,

I'm taking a look through some of the database modules you and Terrence have mentioned, and I have a few first impressions of DBIx::Declare. It seems, at least from the SYNOPSIS, that the module itself could deduce the "type," "requires," and "provides" statements by analyzing the corresponding SQL rather than having the programmer write them. Moreover, the real substance of the SQL statements appears to be the JOIN (i.e. foreign key) relationships concerning how the tables link up. So, the SQL might be automatically generatable from the table definitions (stored in the database) and a definition of the foreign key relationships (which may or may not already be encoded in the database, depending on its capabilities). Therefore, in some cases the entire "__DATA__" section in the SYNOPSIS could be removed as implicit!
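To make the deduction idea concrete, here is a toy sketch. deduce_metadata() is an invented name, the regexes are nowhere near a real SQL parser, and named :placeholders (as some DBD drivers support, rather than DBI's usual '?' style) are assumed:

```perl
use strict;
use warnings;

# Toy sketch: pull "provides" (selected columns) and "requires"
# (named placeholders) out of the SQL text itself, so the programmer
# need not declare them by hand.
sub deduce_metadata {
    my ($sql) = @_;
    my ($cols) = $sql =~ /\bSELECT\s+(.+?)\s+FROM\b/is;
    my @provides;
    for my $col ( split /\s*,\s*/, $cols ) {
        if ( $col =~ /\bAS\s+(\w+)\s*$/i ) {    # prefer an explicit alias
            push @provides, $1;
        }
        else {                                  # else strip any table qualifier
            my ($name) = $col =~ /(\w+)\s*$/;
            push @provides, $name;
        }
    }
    my @requires = $sql =~ /:(\w+)/g;           # named placeholders
    return { provides => \@provides, requires => \@requires };
}
```

Fed "SELECT u.name, u.city AS home_city FROM users u WHERE u.id = :user_id", this would report that the statement requires user_id and provides name and home_city.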

As is, it remains uncertain how locking across multiple SQL statements, sorting (ORDER BY), and range requests (LIMIT) will be handled. One might apply different sorts to the same base SQL statement if, for example, an HTML table had buttons to choose the column and order to sort by. Then there is the question of how to disambiguate two instances of the same required or provided "data type" in the same context (e.g. "city" as in city born in vs. "city" as in city currently residing). Specifically, tables in the database represent classes, while data displayed on web pages often represent instances of those classes. DBIx::Declare would seem to have to be told *about those objects* rather than just the classes (tables), unless the simple case was always taken (i.e. one instance object per class).
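One plausible way to handle the two-cities case is to make the SQL itself assign role-specific aliases, so each join of the same table provides a distinct "data type". The schema and names here are invented for illustration:

```perl
use strict;
use warnings;

# Invented schema: the same "cities" table is joined twice, and each
# join is aliased to a role-specific data type, so the two uses of
# "city" never collide in the provides metadata.
my $sql = <<'SQL';
SELECT born.name  AS birth_city,
       lives.name AS current_city
FROM   people p
JOIN   cities born  ON born.id  = p.birth_city_id
JOIN   cities lives ON lives.id = p.current_city_id
WHERE  p.id = :person_id
SQL
```

That only pushes the instance-vs-class question into naming, of course, but it keeps the metadata unambiguous.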

Still, I believe something declarative like this would be useful. SQL itself is intended to be declarative, but only at the level of the individual statement--not the program as a whole--unless, of course, the program is one big SQL statement. What might be called for is something similar to a knowledgebase (e.g. Cyc, www.cyc.com) implemented on top of a relational database, allowing you to make inferences on both classes and objects given the known entity relationships among them. So, I query the knowledgebase with "If I know such and such, tell me such and such."

-davidm

Aaron J. Mackey wrote:


Regarding "still requires your Perl code to know the names of your SQL statements": take a look at your ABSTRACT example. The process of getting the SQL and creating $dbx is missing.


The SQL can live in any format/object, as long as it somehow includes sufficient provides/requires "metadata" for DBIx::Declare to figure out which statements need to be executed to generate the desired data. To keep things short in the SYNOPSIS I left the SQL with the Perl code, but it could be loaded from a master "dictionary" file, or some form of Phrasebook object with added metadata.
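Such a master dictionary might look roughly like this; the statement names, columns, and :placeholders are all invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical master "dictionary": every named statement carries the
# provides/requires metadata DBIx::Declare would need to choose among
# them.  This could equally live in a file or a Phrasebook object.
my %dictionary = (
    get_user_info => {
        sql      => 'SELECT name, email FROM users WHERE id = :user_id',
        requires => [ 'user_id' ],
        provides => [ 'name', 'email' ],
    },
    get_user_city => {
        sql      => 'SELECT c.name AS city FROM cities c'
                  . ' JOIN users u ON u.city_id = c.id WHERE u.id = :user_id',
        requires => [ 'user_id' ],
        provides => [ 'city' ],
    },
);
```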

Presuming it is created in the same way as in the SYNOPSIS, the SQL is now wed to a particular template. Since you probably often find the need to reuse the same query across the wide variety of small bioinformatic apps you write, you need a way of sourcing the same SQL in a number of different apps, completely independent of which template is using it. The only way I can see this as doable is via a Class::Phrasebook or SQL::Catalog.


Again, the idea of DBIx::Declare is that you *do* have one dictionary/phrasebook/catalog of SQL lying around somewhere that everything can make use of; but the way a script/template/etc makes use of the SQL is by telling DBIx::Declare "Look, I need these three data pieces, and here's what I already have - using this global store of SQL statements, figure out what needs to be done", as opposed to "Go use the SQL named 'get_user_info' to provide me x, y and z given a and b".
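The "figure out what needs to be done" step could be sketched as a naive forward-chaining resolver over the requires/provides catalog. Everything below (plan_statements, the statement names) is invented; a real planner would also need to prune irrelevant statements and handle cycles and ambiguity:

```perl
use strict;
use warnings;

# Toy resolver: repeatedly run any statement whose requirements are
# already satisfied and which provides something new, until the
# wanted data pieces are all covered (or no progress can be made).
sub plan_statements {
    my ( $dict, $have, $want ) = @_;
    my %have = map { $_ => 1 } @$have;
    my @plan;
    my $progress = 1;
    while ( $progress and grep { !$have{$_} } @$want ) {
        $progress = 0;
        for my $name ( sort keys %$dict ) {
            my $st = $dict->{$name};
            next if grep { !$have{$_} } @{ $st->{requires} };     # not runnable yet
            next unless grep { !$have{$_} } @{ $st->{provides} }; # adds nothing new
            push @plan, $name;
            $have{$_} = 1 for @{ $st->{provides} };
            $progress = 1;
        }
    }
    return \@plan;
}

# "I have a login; I want a city" resolves to a two-statement plan:
my %dict = (
    user_by_login => { requires => ['login'],   provides => ['user_id'] },
    city_of_user  => { requires => ['user_id'], provides => ['city']    },
);
my $plan = plan_statements( \%dict, ['login'], ['city'] );
```

The caller never names get_user_info or its kin; it only states what it has and what it wants.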

Regarding how to provide the right data: I have to admit you are right. I think the key question is whether it makes the code sufficiently cleaner. Your examples seem fine for single-row SELECTs, but what about returning multiple rows of data?


Great question; my current thinking is that the returned data is tightly bound to field names. For example, an SQL query might return one row or many:

my ($output) = $dbx->process(@stuff);

$output is a hashref with "wanted" field names as keys; the values of the hash would be scalars if only one row was returned, or arrayrefs for multiple rows.

Or, maybe the values of the hashref should always be arrayrefs, so that your code wouldn't have to worry about whether one row or many were returned.
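Under the always-arrayrefs convention, a small helper (rows_of is an invented name) could turn the field-keyed $output into per-row hashes, so calling code never special-cases the one-row result:

```perl
use strict;
use warnings;

# Sketch of the always-an-arrayref convention: every value in $output
# is an arrayref, and each row is rebuilt by indexing all fields at
# the same position.
sub rows_of {
    my ($output) = @_;
    my @fields = sort keys %$output;
    my $nrows  = @fields ? scalar @{ $output->{ $fields[0] } } : 0;
    my @rows;
    for my $i ( 0 .. $nrows - 1 ) {
        push @rows, { map { $_ => $output->{$_}[$i] } @fields };
    }
    return @rows;
}

# One row or many, the access pattern is identical:
my $output = { name => [ 'Ada', 'Alan' ], city => [ 'London', 'Cambridge' ] };
my @rows   = rows_of($output);
```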

Or, maybe the $output hashref itself should be tied such that a FETCH in scalar context provided the first element of the arrayref (so that code could access it in either fashion). You could also imagine that the tied version had a built-in "iterator" so that looping would be feasible (thus allowing a more memory-sensitive fetchrow() approach):

my ($output) = $dbx->process(@stuff);

# looping construct version 1: every FETCH increases the iterator
while(my @val = values %$output) {
  my %data;
  @data{keys %$output} = @val;
}

# looping construct version 2: iterator functions
while ($output->next_row) {
  # use %$output directly as a hash of row values.
}
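The tied version might look roughly like this. The package name and all details are invented, a real module would also implement STORE, DELETE, CLEAR, and so on, and note one wrinkle the $output->next_row shorthand above glosses over: the iterator method lives on the object returned by tie (or recovered via tied()), not on the hashref itself:

```perl
package TiedRows;    # invented name, for illustration only
use strict;
use warnings;

sub TIEHASH {
    my ( $class, %cols ) = @_;    # field name => arrayref of row values
    my @fields = sort keys %cols;
    my $nrows  = @fields ? scalar @{ $cols{ $fields[0] } } : 0;
    return bless { cols => \%cols, fields => \@fields,
                   nrows => $nrows, row => -1 }, $class;
}

# every FETCH reads that field's value for the current row
sub FETCH  { my ( $s, $k ) = @_; $s->{cols}{$k}[ $s->{row} ] }
sub EXISTS { my ( $s, $k ) = @_; exists $s->{cols}{$k} }

# key iteration walks the field names, so keys/values %$output work
sub FIRSTKEY { $_[0]{fields}[0] }
sub NEXTKEY {
    my ( $s, $last ) = @_;
    for my $i ( 0 .. $#{ $s->{fields} } - 1 ) {
        return $s->{fields}[ $i + 1 ] if $s->{fields}[$i] eq $last;
    }
    return;
}

# built-in iterator: advance to the next row, false once exhausted
sub next_row { my ($s) = @_; return ++$s->{row} < $s->{nrows} }

package main;
my %output;
my $rows = tie %output, 'TiedRows',
    name => [ 'Ada', 'Alan' ], city => [ 'London', 'Cambridge' ];

my @seen;
while ( $rows->next_row ) {    # the "version 2" loop from above
    push @seen, "$output{name}/$output{city}";
}
```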

Thoughts?

Also, I can't help but mention that the requires and provides parts of your SQL description are strongly reminiscent of stored procedures.


Sure; SQL is a declarative language, after all. But stored procedures still must be called by name, not by context.

Finally, where is this module obtained?


Currently, only in the dark nether regions of my cranium. Once I'm satisfied that I've at least tried to satisfy comments such as yours, I'll actually do the work of typing it out.

Thanks for your time and thoughts,

-Aaron
