Hi Beau, I will write more later - I have to get out the door in a minute. Was this intended to be off-list? May I bounce it to the list?
Hans :)

* Beau Sharbrough [Fri, 12 Jul 2002 at 11:26 -0500]
<quote>
> Hans,
>
> This has been a fruitful discussion, I think. If I could offer a few
> thoughts from a LWG perspective (even though Velke and Anderson know a great
> deal more than I do about it)....
>
> * The GDM was never meant to be a database design. I know that you've said
> that many times but it bears repeating. In this case it's useful to repeat
> because you are concerned about redundant storage and the LWG was not
> thinking about storage. At the same time, they were thinking about the
> relationships between entities and perhaps this one is one that can be
> decomposed.
>
> If we oversimplify (because that helps me understand), let's instantiate
> some of these classes.
>
> Repository - Library.
> Source - Book.
>
> In theory, if I associate a book with a library I am describing their
> collection. I could associate a lot of sources with a repository, including
> call numbers and their condition, without being involved in a genealogical
> search. I'm not certain, but I think that this association might best be
> referred to as a CATALOG, which is a well-established model for that
> association.
>
> I think that the LWG may have thought that all linking of sources to
> repositories would take place as the result of a research activity, hence
> the association of activity to this association of sources and repositories.
>
> On reflection, it seems reasonable to have two separate associations - one
> of SOURCE to REPOSITORY (called CATALOG?), and another of ACTIVITY to
> SOURCE-REPOSITORY (or CATALOG).
>
> I don't think that the LWG ever imagined that the Allen County Public
> Library might ever publish an electronic catalog that was compatible with a
> GDM-compatible client. Hey, it was 1996.
>
> Now it doesn't seem so far-fetched that a GDM-compatible client could
> contain links to online catalogs - assuming that they aren't being revised
> in ways that break the links.
>
> Does that complicate the issue sufficiently?
>
> Beau
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of
> Hans Fugal
> Sent: Wednesday, July 10, 2002 11:08 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [gdmxml] more thoughts on entering a source
>
>
> I spent a while wrestling this out with my brother Jacob today. There
> are situations where one would need to know more than just
> repository-id and source-id. For instance, if a particular repository
> had more than one copy of a source and you wanted to indicate which one
> you had searched, repository-id and source-id are not sufficient - you
> would also need to know the call-number. But the call-number itself is
> not unique so can't be used as the primary key in repository-source.
> Using activity-id as the third key doesn't seem to work though, because
> of the extreme redundancy I pointed out. I think repository-source needs
> an id field as a primary key, then search can reference that
> repository-source-id instead of having repository-id and source-id, and
> we take activity-id out of repository-source.
>
> Jacob also helped me see the light on these associative tables (like
> repository-source and source-group-source). While I understood their
> importance in a database context, I was tempted to collapse them a bit
> in an XML context. While that's possible to do while still keeping data
> integrity, it is better to keep them separate.
>
> As always, I welcome your feedback...
> <hans/>
>
> * Stan Mitchell [Tue, 9 Jul 2002 at 23:12 -0700]
> <quote>
> > Yes, it does seem that your suggestion reduces redundancy
> > without sacrificing search capability.
> >
> > Hans Fugal wrote:
> >
> > >But then you have to store call-numbers possibly many times. For
> > >example, a professional researcher would doubtless perform many searches
> > >in any particular US Census.
> > >For that Census the repository, source, call
> > >number and description would all be the same for every repository-source
> > >record. The only unique information in each record would be the
> > >activity-id. Yet if we take out the activity-id from repository-source
> > >we get rid of that redundancy. AFAICS there is no loss of querying power
> > >when we do so - search has all three keys, so if you want to know which
> > >searches you did on a particular call-number, you only have to query the
> > >search table with the repository-id and source-id. Or am I still
> > >missing something?
> > >
> >
> > _______________________________________________
> > gdmxml mailing list
> > [EMAIL PROTECTED]
> > http://fugal.net/cgi-bin/mailman/listinfo/gdmxml
> </quote>
>
> --
> "Everybody is talking about the weather but nobody does anything about it."
> -- Mark Twain
</quote>
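
For reference, the repository-source change proposed in the quoted
10 July message could be sketched roughly as below. This is only an
illustration under stated assumptions, not the GDM or gdmxml definition:
the table names, column names, and sample rows are all hypothetical, and
treating activity_id as the key of search is an assumption as well. It
uses Python's sqlite3 module with an in-memory database. The idea shown is
that repository_source gets its own id, drops activity-id, and search
points at that id, so a call number is stored once per copy rather than
once per search.

```python
import sqlite3

# Hypothetical names throughout; the thread only mentions the keys
# repository-id, source-id, activity-id and the call-number field.
SCHEMA = """
CREATE TABLE repository (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE source (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Surrogate key: one row per copy held by a repository, no activity_id.
CREATE TABLE repository_source (
    id INTEGER PRIMARY KEY,
    repository_id INTEGER NOT NULL REFERENCES repository(id),
    source_id INTEGER NOT NULL REFERENCES source(id),
    call_number TEXT
);
-- Each search references exactly one repository_source row (assumption).
CREATE TABLE search (
    activity_id INTEGER PRIMARY KEY,
    repository_source_id INTEGER NOT NULL REFERENCES repository_source(id)
);
"""


def build_demo():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO repository VALUES (1, 'Library')")
    conn.execute("INSERT INTO source VALUES (1, 'Census')")
    # Two copies of the same source in the same repository.
    conn.executemany(
        "INSERT INTO repository_source VALUES (?, 1, 1, ?)",
        [(1, "CALL-A"), (2, "CALL-B")],
    )
    # Three searches; the call number is not repeated per search.
    conn.executemany(
        "INSERT INTO search VALUES (?, ?)", [(101, 1), (102, 1), (103, 2)]
    )
    return conn


def searches_for_call_number(conn, call_number):
    """Return the activity ids of searches made on one particular copy."""
    rows = conn.execute(
        "SELECT s.activity_id FROM search s "
        "JOIN repository_source rs ON rs.id = s.repository_source_id "
        "WHERE rs.call_number = ? ORDER BY s.activity_id",
        (call_number,),
    )
    return [row[0] for row in rows]
```

With the sample rows above, searches_for_call_number(build_demo(), "CALL-A")
finds the two searches made on the first copy, while the call number itself
appears in only one repository_source row.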