[Fedora-commons-developers] gDatabase

Lodewijk Bogaards Wed, 12 Aug 2009 10:01:25 -0700

Hi,

Today I was thinking about a performance problem we are experiencing with
Fedora. Without bothering you with the details of that problem (we have
already found a way around it), I just wanted to share the thoughts that
led me to the idea of a "gDatabase".


I was thinking about something that does what gSearch does, but writes to a
database instead. I have seen several projects in which this is actually
done, but there the client on top of Fedora handles it rather than Fedora
itself. There is no generic approach.
However, such a service actually already exists within Fedora, and it even
comes with a sweet and well-used rebuild tool. It occurred to me that it
might be really easy to extend that same mechanism. I got enthusiastic and
started digging into Fedora to find out how easy it would be to build an
extension to it.

I found out very quickly that it would be pretty damn easy and would require
less than 200 lines of code.
Here is the breakdown I came up with: Fedora uses a dbspec file to determine
which tables should be created, maintained and rebuilt (when a user decides
to use the rebuild tool). This file is parsed with saxon into a list of
TableSpec objects, each of which contains a list of ColumnSpec objects. The
tables are not filled in a generic way; instead, very old-looking code does
the work (see FieldSearchSqlImpl.update).
My thought is that it would be easy to extend the dbspec file and the
TableSpec and ColumnSpec objects. I am thinking of a simple value selection
process based on a content model, a datastream id and an XPath query. A
table could be matched to a content model, and a column could be matched to
a datastream id alongside an XPath query. The XPath query would be executed
against the datastream to retrieve the value that is written to the
database column.
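As a rough sketch of that value selection step, assuming the datastream
content is available as an XML string: the class and method names below are
made up for illustration and are not Fedora code. It only uses the standard
javax.xml.xpath API.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Fedora): evaluates one XPath query
// against one datastream's XML and returns the string value for a column.
class ColumnValueSelector {

    static String select(String datastreamXml, String xpathQuery) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // Returns the string value of the first match, or "" if nothing matches.
            return xpath.evaluate(xpathQuery,
                    new InputSource(new StringReader(datastreamXml)));
        } catch (XPathExpressionException e) {
            // Unchecked, so a bad query in the configuration surfaces clearly.
            throw new IllegalArgumentException(
                    "Cannot evaluate XPath query: " + xpathQuery, e);
        }
    }
}
```

For a datastream containing <record><title>Test</title></record>, calling
ColumnValueSelector.select(xml, "/record/title") would give the value for a
"title" column. Namespaced datastreams would additionally need a
NamespaceContext, which this sketch leaves out.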
Selecting a content model or a datastream would not be mandatory, but would
enable the user to create different tables for different content models and
provide some optimization of the querying process.
This process could be launched in exactly the same way as
FieldSearchSqlImpl.update is launched.
For configuration I would think of one or more separate dbspec files,
leaving the current dbspec unchanged.
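To make the mapping a bit more concrete, here is a minimal sketch of what
the extended spec objects could carry. All names (CustomTableSpec,
CustomColumnSpec, the "pid" key column) are assumptions for illustration;
they are not the existing TableSpec and ColumnSpec classes and not the
attached code.

```java
import java.util.List;

// Hypothetical column description: a column name plus where its value comes from.
final class CustomColumnSpec {
    final String name;
    final String datastreamId;
    final String xpathQuery;

    CustomColumnSpec(String name, String datastreamId, String xpathQuery) {
        this.name = name;
        this.datastreamId = datastreamId;
        this.xpathQuery = xpathQuery;
    }
}

// Hypothetical table description: a table optionally tied to one content model.
final class CustomTableSpec {
    final String tableName;
    final String contentModel; // null means "applies to every object"
    final List<CustomColumnSpec> columns;

    CustomTableSpec(String tableName, String contentModel,
                    List<CustomColumnSpec> columns) {
        this.tableName = tableName;
        this.contentModel = contentModel;
        this.columns = columns;
    }

    // The content model is optional; without one the table matches all objects.
    boolean appliesTo(String objectContentModel) {
        return contentModel == null || contentModel.equals(objectContentModel);
    }

    // Builds a parameterized INSERT; "pid" as key column is an assumption.
    String insertSql() {
        StringBuilder names = new StringBuilder("pid");
        StringBuilder marks = new StringBuilder("?");
        for (CustomColumnSpec column : columns) {
            names.append(", ").append(column.name);
            marks.append(", ?");
        }
        return "INSERT INTO " + tableName + " (" + names + ") VALUES (" + marks + ")";
    }
}
```

An update step could then loop over the specs that apply to an object, run
each column's XPath query on the named datastream, and bind the results to
the statement produced by insertSql().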

I think this would save people some time and would be extremely easy to
achieve, so the return on investment would be high. It would be fast,
reliable, synchronized, highly adaptable and integrated with the rebuild
tool, so nothing extra would be needed. It would provide disseminators and
services with fast access to the data without having to build their own
database.

I have attached some of the code I wrote this morning when I thought I'd
give it a try. Right now it does not look like I'll be finishing it, though,
because we don't need it, but I thought I'd share the idea through words and
code.

Hope to hear from somebody if this is of any interest.

Kind regards,

Lodewijk Bogaards

CustomFieldsSQLImpl.java
Description: Binary data


