On Tue, 15/03/2011 at 09:31 +0100, Alexander Wagner wrote:
> On 15.03.2011 01:55, Tibor Simko wrote:
>
> Hi!
>
> >> OK, let's assume we would like at least to preserve the OAI ID in the
> >> MARC, just to be sure if the record is in any time changed of recid,
> >> the OAI ID is still preserved to guarantee uniqueness.
> >
> > Yes, it is generally advantageous if OAI ID is kept in MARC together
> > with the rest of the record metadata.
>
> From a pure users point of view I admit that it is generally a good
> idea to keep as much as possible within MARC...
>
> [...]
>
> > Having OAI Set information in MARC alongside OAI ID may be nice for
> > consistency and other reasons though. It may be good for those Invenio
> > instances that do not use OAI Repository Updater but manage OAI IDs and
> > Sets otherwise. If all this information lives in MARC, then one may use
> > regular cataloguing tools such as record editor, multi-record editor and
> > friends in order to check and manipulate this information.
>
> ... for exactly that reason. It is pretty easy to keep informations of
> whatever kind up to date if one can work on them within tools like the
> record editor. This can be handled by every librarian while other areas
> might need a programmer. Marc also allows manipulation on a very high
> abstraction level and there're a bunch of tools for working on MARC.
It's true that the more information there is in MARC, the easier it is to
build tools on top of it. On the other hand, this comes at a price.

My original motivation for moving the list of OAI sets a record belongs to
out of MARC is purely performance. Currently, in Invenio,
oai_repository_updater is incredibly slow, because it has to compute all
the sets and check, for each record, whether the record needs to be
updated. This work is linear in the number of records. (In fact it can be
optimized by using the search_engine, but any record entering or leaving a
set still needs to be touched by oai_repository_updater, triggering all
sorts of derived updates.)

If OAI sets were kept outside MARC, updating an OAI set would just be a
matter of running the search queries that compose the set and saving the
list of resulting recids. The computation would then be linear in the
number of sets. On CDS, for example, this would bring the running time of
oai_repository_updater down from ~20 minutes to less than a minute.

What about having both approaches, with the former (set information stored
in MARC) only optionally enabled, for the case where one maintains OAI sets
by hand (and not via oai_repository_updater)? At CDS we would keep it
turned off, while other installations could turn it on...

Cheers,
Sam

-- 
Samuele Kaplun
Invenio Developer ** <http://invenio-software.org/>
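
P.S. To make the per-set idea concrete, here is a rough Python sketch. It
is not Invenio code: compute_set_membership, sets_of_record, toy_search,
the toy index and the set definitions are all made up for illustration,
and toy_search merely stands in for a real search call. The point is only
that the number of searches grows with the number of sets and queries, and
that no record has to be rewritten when its set membership changes.

```python
from typing import Callable, Dict, List, Set

# Hypothetical search signature: query string in, matching recids out.
# This is an assumption for the sketch, not Invenio's search_engine API.
SearchFn = Callable[[str], Set[int]]


def compute_set_membership(
    set_definitions: Dict[str, List[str]],
    search: SearchFn,
) -> Dict[str, Set[int]]:
    """Return, for every OAI set, the recids matched by any of its queries.

    One search is run per query of each set, so the work grows with the
    number of sets, not with the number of records being rewritten.
    """
    membership: Dict[str, Set[int]] = {}
    for set_spec, queries in set_definitions.items():
        recids: Set[int] = set()
        for query in queries:
            recids |= search(query)  # union of all queries defining the set
        membership[set_spec] = recids
    return membership


def sets_of_record(recid: int, membership: Dict[str, Set[int]]) -> List[str]:
    """Look up the sets a record belongs to without touching its MARC."""
    return sorted(spec for spec, recids in membership.items() if recid in recids)


# Toy data standing in for a real index (purely illustrative).
TOY_INDEX: Dict[str, Set[int]] = {
    "collection:THESIS": {1, 2, 3},
    "collection:PREPRINT": {3, 4},
    "year:2011": {2, 4, 5},
}


def toy_search(query: str) -> Set[int]:
    """Return a copy of the recids stored for the query, or an empty set."""
    return set(TOY_INDEX.get(query, set()))


if __name__ == "__main__":
    definitions = {
        "theses": ["collection:THESIS"],
        "recent": ["collection:PREPRINT", "year:2011"],
    }
    membership = compute_set_membership(definitions, toy_search)
    print(membership)
    print(sets_of_record(3, membership))
```

The resulting mapping would be saved outside MARC (for instance in its own
table), which is why a record moving between sets would not trigger any
derived updates on the record itself.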
