Sivakatirswami,

I sent your message on to a colleague who is an expert in text markup schemes 
and this was his reply. It may imply a slightly different direction from what 
you have started with.

HTH

Devin

Begin forwarded message:

> From: Jarom McDonald 
> Date: May 11, 2010 8:42:40 AM MDT
> To: Devin Asay <devin_a...@byu.edu>
> Subject: Re: OT: Resources for Data Base Design
> 
> Hi Devin,
> 
> At least in the world of academia, what he's looking for just isn't done. 
> Whether for philosophical reasons, common practice reasons, or whatever, 
> there is very little work done in decomposing texts for relational models. 
> Rather, texts are kept whole and marked up in XML, which to most people 
> preserves the complexity of the text and facilitates publishing and 
> dissemination.
> 
> This isn't to say that relational models can't be useful; I have seen 
> products where texts are marked up in the TEI schema (the standard for XML 
> encoding of text) and then elements are chopped up and put into a DB; 
> however, you can achieve similar levels of performance with an XML database. 
> The most widely used is eXist; there are plenty of scripts you can find by 
> googling TEI + eXist that can help in storing XML docs in the XML database, 
> querying with XQuery and XPath to find documents, creating indices, etc.
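[Ed. note: a minimal sketch of the XPath-style querying described above, using only Python's standard library. The TEI fragment is invented for illustration (a real TEI document lives in the http://www.tei-c.org/ns/1.0 namespace); eXist would run comparable XQuery/XPath server-side against its stored documents.]

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free TEI-like fragment: chapters contain
# poems (<lg>, "line group"), poems contain verse lines (<l>).
tei = """<TEI>
  <text>
    <body>
      <div type="chapter" n="1">
        <lg type="poem" n="1">
          <l n="1">First verse of the first poem</l>
          <l n="2">Second verse of the first poem</l>
        </lg>
      </div>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(tei)

# XPath-style query: every verse line inside poem 1 of chapter 1.
verses = root.findall(".//div[@n='1']/lg[@n='1']/l")
for l in verses:
    print(l.get("n"), l.text)
```

ElementTree supports only a subset of XPath; eXist's XQuery engine handles the full language plus indexing, which is the performance point being made above.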
> 
> Of course, this probably doesn't help much, as Revolution has native support 
> for RDBMSs but not for XML databases. But for full texts, the relational 
> route just isn't used in academia on any sort of wide scale.
> 
> Jarom
> 
> On Mon, May 10, 2010 at 9:36 AM, Devin Asay <devin_a...@byu.edu> wrote:
> Jarom,
> 
> This came over the Revolution mail list. Any recommendations I could point 
> him to? The last long paragraph details what he's looking for.
> 
> Devin
> 
> 
> Begin forwarded message:
> 
>> From: Sivakatirswami <ka...@hindu.org>
>> Date: May 8, 2010 7:53:08 PM MDT
>> To: How to use Revolution <use-revolution@lists.runrev.com>
>> Subject: OT: Resources for Data Base Design
>> Reply-To: How to use Revolution <use-revolution@lists.runrev.com>
>> 
>> I'm working on a content management database based on the Dublin Core 
>> and the Media Annotation Initiative. Much of the discourse and its terms 
>> translate well into a database schema, but when the discussion turns to 
>> fine tuning and switches to an RDF framework, it becomes difficult to 
>> translate some of the principles into actual table-field structures in a 
>> PostgreSQL database. The Dublin Core seems in some respects a very 
>> abstract realm... but things are different where the rubber hits the road.
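[Ed. note: one common way to flatten RDF-style Dublin Core statements into table-field structures, sketched below; this is not the only design, and all table, column, and sample names are illustrative. Each row mirrors the subject/predicate/object shape of an RDF triple; SQLite stands in for PostgreSQL, and the SQL is portable.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE resource (
    id  INTEGER PRIMARY KEY,
    uri TEXT NOT NULL UNIQUE          -- the RDF subject
);
CREATE TABLE dc_statement (
    resource_id INTEGER NOT NULL REFERENCES resource(id),
    element     TEXT NOT NULL,        -- DC term, e.g. 'title', 'creator'
    qualifier   TEXT,                 -- refinement, e.g. 'created'
    value       TEXT NOT NULL         -- the object (literal or URI)
);
""")

# One resource, three metadata statements about it.
con.execute("INSERT INTO resource (id, uri) VALUES (1, 'urn:video:42')")
con.executemany(
    "INSERT INTO dc_statement VALUES (1, ?, ?, ?)",
    [("title",   None,      "Morning Talk"),
     ("creator", None,      "Gurudeva"),
     ("date",    "created", "1998-06-01")])

rows = con.execute(
    "SELECT element, qualifier, value FROM dc_statement "
    "WHERE resource_id = 1 ORDER BY element").fetchall()
for r in rows:
    print(r)
```

The appeal of this shape is that new DC elements or qualifiers need no schema change; the cost is that every lookup goes through the statement table rather than a dedicated column.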
>> 
>> I've looked pretty closely at the databases generated by XOOPS, Drupal 
>> and WordPress and frankly, they are freaky scary. I see a hodgepodge of 
>> strategies, each differing depending on whose module's tables you are 
>> looking at. That's why I want to stay with Dublin Core, where the "human 
>> readability" principle is kept at the forefront of the design. I'm pretty 
>> close to finishing a schema that I think can contain pretty much all the 
>> metadata for any video, text or audio, translations, pamphlets, FAQs, 
>> etc. that we have. I suppose we are re-inventing the wheel a bit, but in 
>> the end we will get something that is a good match for our needs, and we 
>> will not be boxed into the framework of a monster CMS that we cannot 
>> customize without spending huge $ on PHP-module consultants... (been 
>> there, done that, nightmare)
>> 
>> Metadata for a video or a sound file or an image is simple enough....
>> 
>> The part of the database I'm unable to finish is the part that deals 
>> with text fragments. I think I posted this before on this list but got 
>> no responses. If anyone knows the best list or group I should go to for 
>> help, let me know. What I'm interested in should be pretty standard 
>> stuff in the world of academia: e.g., if you want a database to contain 
>> the most atomic elements of a text resource (one record for every single 
>> verse of every single poem from a book where the poems are divided into 
>> chapters, the chapters into sections, the sections into parts of a book, 
>> and the book is one volume in a series...), what is the best schema that 
>> allows you to query the database to re-aggregate all those elements into 
>> the original source document, at run time (or on a cron, or periodically 
>> after modifications are posted)? And/or, what other approaches might 
>> better serve the end game (be able to query for a single verse with a 
>> complete citation; be able to query for an entire poem with citation; be 
>> able to query for a complete chapter of poems with a citation... etc.)? 
>> I have some solutions in mind, and I may just proceed with those and 
>> refactor later if something better comes along... but I would love to 
>> hear from some experts and see some existing models.
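[Ed. note: one standard relational answer to the verse/poem/chapter question above is an "adjacency list": a single self-referencing table of nodes, with a recursive query that walks back up the tree to rebuild a full citation for any verse, and an ordered query over children to re-aggregate a poem. The sketch below uses SQLite for runnability; PostgreSQL supports the same WITH RECURSIVE syntax. All table, column, and sample names are illustrative.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES node(id),  -- NULL at the root
    kind      TEXT NOT NULL,    -- 'series','book','chapter','poem','verse',...
    label     TEXT NOT NULL,    -- title or number, for citations
    seq       INTEGER NOT NULL, -- order among siblings
    body      TEXT              -- verse text; NULL for container nodes
);
INSERT INTO node VALUES
 (1, NULL, 'book',    'Book of Hymns', 1, NULL),
 (2, 1,    'chapter', 'Chapter 1',     1, NULL),
 (3, 2,    'poem',    'Poem 3',        3, NULL),
 (4, 3,    'verse',   'Verse 2',       2, 'O mango tree in bloom...');
""")

# Citation: walk from verse id=4 up to the root, then print root-first.
rows = con.execute("""
WITH RECURSIVE ancestry(id, parent_id, label, depth) AS (
    SELECT id, parent_id, label, 0 FROM node WHERE id = 4
    UNION ALL
    SELECT n.id, n.parent_id, n.label, a.depth + 1
    FROM node n JOIN ancestry a ON n.id = a.parent_id
)
SELECT label FROM ancestry ORDER BY depth DESC
""").fetchall()
citation = ", ".join(label for (label,) in rows)
print(citation)  # Book of Hymns, Chapter 1, Poem 3, Verse 2

# Re-aggregation: all verses of poem id=3, in original order.
verses = con.execute(
    "SELECT body FROM node WHERE parent_id = 3 AND kind = 'verse' "
    "ORDER BY seq").fetchall()
```

The trade-off versus the XML route discussed earlier in the thread: re-aggregating the whole source document means reassembling it from rows, whereas a marked-up text is already whole; the relational shape wins when the unit of query really is the single verse with its citation.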
>> 
>> Any ideas of where to go looking for mangos?
>> 
> 

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University

_______________________________________________
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
