http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=10662
--- Comment #41 from David Cook <dc...@prosentient.com.au> ---
(In reply to Katrin Fischer from comment #33)
> I was told recently that 2-3 seconds is quite standard for OAI-PMH harvests.
>
> I think a problem could occur if Zebra is involved in matching as you have
> to make sure the indexes have caught up before you can reliably match. Say a
> record is changed at the source twice in a very short timeframe... or added
> and then changed again, included in 2 harvests... but not yet indexed when
> the second runs, etc.

I agree once again with Katrin. I think I've said before (either here or via email) that using Zebra for matching can be very unreliable.

Currently, I use the unique OAI-PMH identifiers to handle all harvested records, and that's quite robust, since that identifier should be persistent. However, that obviously doesn't help with matching OAI-PMH harvested records against local records created via other methods.

In the short term, perhaps merging bibliographic records would have to occur manually. Or maybe a deduplication tool could be created to semi-automate that task... although I think that tool would have to prevent any deletion of OAI-PMH harvested records.

Actually, this hearkens back to my previous comment. It would be good if each record had a simple way of identifying its origin. So you couldn't delete a record obtained via OAI-PMH unless its parent repository was deleted from Koha or unless you used an OAI-PMH management tool to delete records for that repository.

I think providing this "source" or "origin" would need to be done consistently or rather... extensibly. I wouldn't want it to be OAI-PMH specific as that would be short-sighted.

At the moment, everything that goes through svc/import_bib uses a webservices import_batch... but that's not very unique. It would be interesting to have unique identifiers for import sources.
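
As a rough illustration of the two ideas above (handling harvested records by their persistent OAI-PMH identifier, and refusing deletion unless the request comes from the record's own origin), here is a minimal Python sketch. It is not Koha code: `HarvestedRecord`, `RecordStore`, the `origin` label and the `requested_by` argument are all hypothetical names, and the store is just an in-memory dictionary. It assumes every datestamp from a given repository uses the same granularity, so plain string comparison orders them correctly.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HarvestedRecord:
    oai_identifier: str  # persistent identifier from the OAI-PMH record header
    datestamp: str       # header datestamp, e.g. "2015-10-01T02:03:04Z" (UTC)
    marcxml: str         # record payload
    origin: str          # hypothetical origin label, e.g. "oai:repo-1"


class RecordStore:
    """In-memory stand-in for the catalogue (hypothetical, not a Koha class)."""

    def __init__(self) -> None:
        self._by_identifier: Dict[str, HarvestedRecord] = {}

    def upsert(self, record: HarvestedRecord) -> str:
        """Add or update by identifier; no search index is consulted."""
        existing = self._by_identifier.get(record.oai_identifier)
        if existing is None:
            self._by_identifier[record.oai_identifier] = record
            return "added"
        # Assumption: same datestamp granularity, so string order is time order.
        if record.datestamp > existing.datestamp:
            self._by_identifier[record.oai_identifier] = record
            return "updated"
        return "skipped"

    def delete(self, oai_identifier: str, requested_by: str) -> bool:
        """Delete only when the request comes from the record's own origin."""
        record = self._by_identifier.get(oai_identifier)
        if record is None:
            return False
        if requested_by != record.origin:
            raise PermissionError(
                f"{oai_identifier} belongs to {record.origin}; "
                f"{requested_by} may not delete it"
            )
        del self._by_identifier[oai_identifier]
        return True
```

Because the lookup is a direct key match, a record that is added and then changed again a few seconds later is found by the second harvest straight away, which is the case where waiting on Zebra indexing becomes a problem.
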
So you might use svc/import_bib with connexion_import_daemon.pl, or with MarcEdit, or your home-grown script, or whatever. It would be interesting to distinguish those separately... and maybe prevent writes/deletions for records that are entered via connexion_import_daemon.pl and home-grown script XYZ, while leaving ones imported via MarcEdit to be managed however you like, since you just exported some original records and re-imported them via MarcEdit after making some changes.

One way of doing that would actually be to use developer keys... so a developer would need to get a key from Koha before using the web service, and then the Koha sysadmin could handle the interaction between that service and Koha's internals using that key (e.g. if records are imported via Webservice A, prevent Koha users from doing anything with them).

I suppose that's a bit tougher to do with OAI-PMH... but not necessarily. When a new OAI-PMH repository is added, the system could generate a key for it, and use that key for handling the permissions for Koha users...

I think that element of the discussion would relate a lot to http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=14957...
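
The developer-key idea could look roughly like the following Python sketch. Nothing here exists in Koha: `SourceRegistry`, `SourcePolicy`, and the `staff_can_edit` / `staff_can_delete` flags are hypothetical names chosen for illustration. The assumption is that each import source (a web service client or an OAI-PMH repository) receives a generated key when it is registered, and that the sysadmin attaches a policy to that key.

```python
import secrets
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SourcePolicy:
    name: str               # human-readable label for the import source
    staff_can_edit: bool    # may Koha staff modify records from this source?
    staff_can_delete: bool  # may Koha staff delete records from this source?


class SourceRegistry:
    """Maps generated source keys to policies (hypothetical, not a Koha API)."""

    def __init__(self) -> None:
        self._policies: Dict[str, SourcePolicy] = {}

    def register(self, name: str, *, staff_can_edit: bool,
                 staff_can_delete: bool) -> str:
        """Generate a key for a new import source and store its policy."""
        key = secrets.token_hex(16)  # 32 hex characters
        self._policies[key] = SourcePolicy(name, staff_can_edit, staff_can_delete)
        return key

    def policy_for(self, key: str) -> SourcePolicy:
        """Look up the policy for a key; unknown keys are rejected."""
        try:
            return self._policies[key]
        except KeyError:
            raise PermissionError("unknown source key") from None

    def staff_may(self, key: str, action: str) -> bool:
        """Answer whether staff may 'edit' or 'delete' records from a source."""
        policy = self.policy_for(key)
        if action == "edit":
            return policy.staff_can_edit
        if action == "delete":
            return policy.staff_can_delete
        raise ValueError(f"unsupported action: {action}")


registry = SourceRegistry()
# Locked down: records from this source are managed only by the source itself.
daemon_key = registry.register("connexion_import_daemon.pl",
                               staff_can_edit=False, staff_can_delete=False)
# Open: these are local records that were exported, edited and re-imported.
marcedit_key = registry.register("MarcEdit",
                                 staff_can_edit=True, staff_can_delete=True)
```

Under this assumption, each imported record would carry the key of the source that created it, and staff-facing edit or delete actions would call something like `staff_may` before proceeding. A key generated when an OAI-PMH repository is added would be handled the same way as one issued to a web service developer.
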