Hans,

This has been a fruitful discussion, I think. If I could offer a few thoughts from an LWG perspective (even though Velke and Anderson know a great deal more than I do about it)....
* The GDM was never meant to be a database design. I know that you've said that many times, but it bears repeating. In this case it's useful to repeat because you are concerned about redundant storage and the LWG was not thinking about storage. At the same time, they were thinking about the relationships between entities, and perhaps this is one that can be decomposed.

  If we oversimplify (because that helps me understand), let's instantiate some of these classes: Repository - Library. Source - Book. In theory, if I associate a book with a library I am describing the library's collection. I could associate a lot of sources with a repository, including call numbers and their condition, without being involved in a genealogical search. I'm not certain, but I think that this association might best be referred to as a CATALOG, which is a well-established model for that association.

  I think that the LWG may have thought that all linking of sources to repositories would take place as the result of a research activity, hence the association of activity to this association of sources and repositories. On reflection, it seems reasonable to have two separate associations - one of SOURCE to REPOSITORY (called CATALOG?), and another of ACTIVITY to SOURCE-REPOSITORY (or CATALOG).

  I don't think that the LWG ever imagined that the Allen County Public Library might publish an electronic catalog that was compatible with a GDM-compatible client. Hey, it was 1996. Now it doesn't seem so far-fetched that a GDM-compatible client could contain links to online catalogs - assuming that they aren't being revised in ways that break the links.

Does that complicate the issue sufficiently?

Beau

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Hans Fugal
Sent: Wednesday, July 10, 2002 11:08 PM
To: [EMAIL PROTECTED]
Subject: Re: [gdmxml] more thoughts on entering a source

I spent a while wrestling this out with my brother Jacob today.
There are situations where one would need to know more than just repository-id and source-id. For instance, if a particular repository had more than one copy of a source and you wanted to indicate which one you had searched, repository-id and source-id are not sufficient - you would also need to know the call-number. But the call-number itself is not unique, so it can't be used as the primary key in repository-source. Using activity-id as the third key doesn't seem to work either, because of the extreme redundancy I pointed out.

I think repository-source needs an id field as a primary key; then search can reference that repository-source-id instead of having repository-id and source-id, and we take activity-id out of repository-source.

Jacob also helped me see the light on these associative tables (like repository-source and source-group-source). While I understood their importance in a database context, I was tempted to collapse them a bit in an XML context. While that's possible to do while still keeping data integrity, it is better to keep them separate.

As always, I welcome your feedback...

<hans/>

* Stan Mitchell [Tue, 9 Jul 2002 at 23:12 -0700]
<quote>
> Yes, it does seem that your suggestion reduces redundancy
> without sacrificing search capability.
>
> Hans Fugal wrote:
>
> >But then you have to store call-numbers possibly many times. For
> >example, a professional researcher would doubtless perform many searches
> >in any particular US Census. For that Census the repository, source, call
> >number and description would all be the same for every repository-source
> >record. The only unique information in each record would be the
> >activity-id. Yet if we take out the activity-id from repository-source
> >we get rid of that redundancy.
> >AFAICS there is no loss of querying power
> >when we do so - search has all three keys, so if you want to know which
> >searches you did on a particular call-number, you only have to query the
> >search table with the repository-id and source-id. Or am I still
> >missing something?
> >
>
> _______________________________________________
> gdmxml mailing list
> [EMAIL PROTECTED]
> http://fugal.net/cgi-bin/mailman/listinfo/gdmxml
</quote>

--
"Everybody is talking about the weather but nobody does anything about it."
 -- Mark Twain
_______________________________________________
gdmxml mailing list
[EMAIL PROTECTED]
http://fugal.net/cgi-bin/mailman/listinfo/gdmxml
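
The restructuring discussed in this thread (give repository-source its own id, let search point at that id, and drop activity-id from repository-source) can be sketched as a small relational schema. This is only an illustration, assuming SQLite through Python's standard sqlite3 module. The table and column names (repository_source, call_number, repository_source_id and so on) and the sample rows are hypothetical and are not taken from the GDM specification.

```python
import sqlite3

# Hypothetical schema: all names are illustrative, not from the GDM itself.
SCHEMA = """
CREATE TABLE repository (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE source     (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE activity   (id INTEGER PRIMARY KEY, note TEXT);

-- One row per copy of a source held by a repository. It has its own id
-- because call_number is not assumed to be unique, and it carries no
-- activity_id, so a row can exist without any research activity.
CREATE TABLE repository_source (
    id            INTEGER PRIMARY KEY,
    repository_id INTEGER NOT NULL REFERENCES repository(id),
    source_id     INTEGER NOT NULL REFERENCES source(id),
    call_number   TEXT,
    description   TEXT
);

-- A search is an activity that points at exactly one repository_source row.
CREATE TABLE search (
    activity_id          INTEGER PRIMARY KEY REFERENCES activity(id),
    repository_source_id INTEGER NOT NULL REFERENCES repository_source(id)
);
"""


def demo():
    """Build the schema in memory, load made-up rows, and query them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)

    conn.execute("INSERT INTO repository VALUES (1, 'Example Library')")
    conn.execute("INSERT INTO source VALUES (1, 'Example Census')")

    # Two copies of the same source in the same repository.
    conn.executemany(
        "INSERT INTO repository_source VALUES (?, 1, 1, ?, ?)",
        [(1, "CN-1", "copy A"), (2, "CN-2", "copy B")],
    )

    # Three searches, all against copy A; the call number is stored only once.
    for activity_id in (1, 2, 3):
        conn.execute(
            "INSERT INTO activity VALUES (?, 'census lookup')", (activity_id,)
        )
        conn.execute("INSERT INTO search VALUES (?, 1)", (activity_id,))

    copies = conn.execute("SELECT COUNT(*) FROM repository_source").fetchone()[0]
    searched = [
        row[0]
        for row in conn.execute(
            "SELECT s.activity_id FROM search s "
            "JOIN repository_source rs ON rs.id = s.repository_source_id "
            "WHERE rs.call_number = ? ORDER BY s.activity_id",
            ("CN-1",),
        )
    ]
    conn.close()
    return copies, searched


if __name__ == "__main__":
    print(demo())
```

In this sketch the call number and description are stored once per copy no matter how many searches are recorded against it, and the searches for a given call number are still recoverable with a single join. Whether the join table should be named CATALOG, as suggested earlier in the thread, is left open here.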