Author handling bfe_authors.py et al
Hi!

Currently, bfe_authors.py uses

    authors = []
    authors_1 = bfo.fields('100__')
    authors_2 = bfo.fields('700__')

This means authors are considered only if both indicators are blank. However, according to the authoritative definition at

    http://www.loc.gov/marc/bibliographic/bd100.html

one should actually set at least indicator 1, as in

    100 0_ $aWinston Churchill
    100 1_ $aChurchill, Winston

to distinguish storage of "firstname lastname" from "lastname, firstname", not to mention cases like

    100 3_ $aFarquhar family

Since uploading foreign data brings in a lot of indicators here, I suggest changing the author-handling code to use

    authors = []
    authors_1 = bfo.fields('100%%')
    authors_2 = bfo.fields('700%%')

wherever it is applicable. This is also relevant for indexing, where the default definition is __ as well.

I just got some 50.000 anonymous papers, which I'll now give back to their authors ;)

-- 
Kind regards,

Alexander Wagner
Subject Specialist
Central Library
52425 Juelich

mail : a.wag...@fz-juelich.de
phone: +49 2461 61-1586
Fax  : +49 2461 61-6103
www.fz-juelich.de/zb/DE/zb-fi

Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDirig Dr. Karl Eugen Huthmacher
Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt
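To make the difference between '100__' and '100%%' concrete, here is a minimal standalone sketch of how such a field spec could be matched against a tag plus its two indicators. It does not use Invenio code: the function `spec_matches`, the helper `select` and the sample tuples are hypothetical, and it assumes that each `%` matches exactly one character and that `_` stands for a blank indicator.

```python
def spec_matches(spec, tag, ind1, ind2):
    """Return True if a 5-character field spec matches tag plus indicators.

    Assumptions of this sketch (not taken from the Invenio source):
    '%' matches any single character, '_' stands for a blank indicator.
    """
    if len(spec) != 5:
        raise ValueError("spec must be 3 tag characters plus 2 indicators")
    # Normalise blank indicators to '_' so '100__' matches a field without indicators.
    actual = tag + (ind1.strip() or "_") + (ind2.strip() or "_")
    return all(s == "%" or s == a for s, a in zip(spec, actual))


# Hypothetical sample fields: (tag, indicator 1, indicator 2, value of $a).
FIELDS = [
    ("100", "1", " ", "Churchill, Winston"),
    ("100", " ", " ", "Anonymous entry"),
    ("700", "3", " ", "Farquhar family"),
]


def select(spec):
    """Return the values of all sample fields matched by the spec."""
    return [value for tag, i1, i2, value in FIELDS
            if spec_matches(spec, tag, i1, i2)]


print(select("100__"))  # only the field with blank indicators
print(select("100%%"))  # every 100 field, whatever its indicators
```

With the blank-indicator spec, the "Churchill, Winston" entry is silently skipped, which is the effect described in the message above.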
Re: Author handling bfe_authors.py et al
Hello Alexander,

> Currently, bfe_authors.py uses
>
>     authors = []
>     authors_1 = bfo.fields('100__')
>     authors_2 = bfo.fields('700__')
>
> This enforces authors to be considered only if both indicators are blank. However, you may notice that from the authoritative definition at
> http://www.loc.gov/marc/bibliographic/bd100.html

this is an example of the default values of Invenio not being MARC 21 compliant. I've complained about this a few times, and I should have filed a task about those defaults and sent a few patches, although I haven't yet ;-(.

The incorrect defaults have a practical consequence: if you import records from another catalog, they misbehave in Invenio. Librarians (and library-educated computer people) expect those records to behave as they do in the other system. So it is a matter of interoperability and economy.

The problem arises not only in those Python bibformat snippets, but also in the bibindex definitions and in all export formats. For example, my bfe_author.py selects these fields:

    if (authors_type in ['','personal']):
        authors.extend(bfo.fields('100%%'))
    if (authors_type in ['','corporate']):
        authors.extend(bfo.fields('110%%'))
    if (authors_type in ['','meeting']):
        authors.extend(bfo.fields('111%%'))
    if (authors_type in ['','personal']):
        authors.extend(bfo.fields('700%%'))
    if (authors_type in ['','corporate']):
        authors.extend(bfo.fields('710%%'))
    if (authors_type in ['','meeting']):
        authors.extend(bfo.fields('711%%'))
    if (authors_type in ['','personal','corporate','meeting']):
        authors.extend(bfo.fields('720%%'))

(In my case I sometimes need to show personal or corporate authors, depending on the collection; I understand that this is not always the case.)

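The chain of `if` statements above can also be written as a small lookup table. The following is only a sketch of that idea, not the code I run: the names `AUTHOR_TAGS` and `author_field_specs` are made up for illustration, and the function only builds the list of field specs instead of calling bfo.fields() itself.

```python
# Hypothetical table: MARC tag -> kind of author it holds, in output order.
AUTHOR_TAGS = [
    ("100", "personal"), ("110", "corporate"), ("111", "meeting"),
    ("700", "personal"), ("710", "corporate"), ("711", "meeting"),
]
KNOWN_TYPES = ("", "personal", "corporate", "meeting")


def author_field_specs(authors_type=""):
    """Return the field specs to read for the requested author type.

    An empty authors_type selects every kind; an unknown type selects
    nothing, which mirrors the behaviour of the if-chain above.
    """
    specs = [tag + "%%" for tag, kind in AUTHOR_TAGS
             if authors_type in ("", kind)]
    if authors_type in KNOWN_TYPES:
        # 720 (uncontrolled name) is included for every known type.
        specs.append("720%%")
    return specs


print(author_field_specs("corporate"))
```

Inside a format element, each returned spec would then be passed to bfo.fields() as in the snippet above.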
In bibindex (/admin/bibindex/bibindexadmin.py/field), you also have to add those fields:

    author    100%, 110%, 111%, 700%, 710%, 711%, 720%

In the MARCXML to DC XSL stylesheet as well:

    <xsl:for-each select="datafield[(@tag=100 or @tag=110 or @tag=111)]">
      <dc:creator>
        <xsl:call-template name="subfieldSelect">
          <xsl:with-param name="codes">ab</xsl:with-param>
        </xsl:call-template>
      </dc:creator>
    </xsl:for-each>
    <xsl:for-each select="datafield[(@tag=700 or @tag=710 or @tag=711 or @tag=720)]">
      <dc:contributor>
        <xsl:call-template name="subfieldSelect">
          <xsl:with-param name="codes">ab</xsl:with-param>
        </xsl:call-template>
      </dc:contributor>
    </xsl:for-each>

I borrowed the following XSL template from somewhere (LC, I think):

    <!-- Added FJ 5-feb-2010 to resolve template -->
    <xsl:template name="subfieldSelect">
      <xsl:param name="codes">abcdefghijklmnopqrstuvwxyz</xsl:param>
      <xsl:param name="delimeter">
        <xsl:text> </xsl:text>
      </xsl:param>
      <xsl:variable name="str">
        <xsl:for-each select="subfield">
          <xsl:if test="contains($codes, @code)">
            <xsl:value-of select="text()"/>
            <xsl:value-of select="$delimeter"/>
          </xsl:if>
        </xsl:for-each>
      </xsl:variable>
      <xsl:value-of select="substring($str,1,string-length($str)-string-length($delimeter))"/>
    </xsl:template>

And so on. It is a major task, but much needed. Newcomers are likely to feel frustrated when the system does not behave as expected.

Ferran
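For readers less used to XSLT, the same mapping can be sketched in Python with the standard library. This is an illustration of what the stylesheet fragment does, not part of Invenio: the function names, the namespace-free `<record>` layout and the sample record are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

CREATOR_TAGS = {"100", "110", "111"}
CONTRIBUTOR_TAGS = {"700", "710", "711", "720"}


def subfield_select(datafield, codes="abcdefghijklmnopqrstuvwxyz", delimiter=" "):
    """Join the text of the subfields whose code is listed in codes."""
    parts = []
    for subfield in datafield.findall("subfield"):
        code = subfield.get("code")
        if code and code in codes:
            parts.append(subfield.text or "")
    return delimiter.join(parts)


def dublin_core_names(record_xml):
    """Return (creators, contributors) built from subfields a and b."""
    record = ET.fromstring(record_xml)
    creators, contributors = [], []
    for datafield in record.findall("datafield"):
        tag = datafield.get("tag")
        if tag in CREATOR_TAGS:
            creators.append(subfield_select(datafield, "ab"))
        elif tag in CONTRIBUTOR_TAGS:
            contributors.append(subfield_select(datafield, "ab"))
    return creators, contributors


# Hypothetical sample record, without XML namespace, for illustration only.
SAMPLE = """<record>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Churchill, Winston</subfield>
    <subfield code="e">author</subfield>
  </datafield>
  <datafield tag="710" ind1="2" ind2=" ">
    <subfield code="a">Example Institute</subfield>
    <subfield code="b">Library</subfield>
  </datafield>
</record>"""

print(dublin_core_names(SAMPLE))
```

Note that the indicators play no role in the selection: any 100 or 710 field is picked up, whatever its indicators are.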
Re: Author handling bfe_authors.py et al
Hello Alexander,

> this is an example of the default values of Invenio not being Marc21 compliant.

Right. And then these are bad defaults.

> I've complained about this a few times, and I should have filed a task about those defaults, and sent a few patches, although I haven't yet ;-(.

The reasons why I haven't done it myself, besides the lack-of-time issue (bad excuse), are these: on my instances I have a mix of better-than-default values and local ones; I don't have (or don't have the resources to have) a reasonably recent Invenio instance running anywhere (we are still at 0.99.1), so I'd be patching something old; and, even within those restrictions, when I tried I ran into the example records (modules/miscutil/sql/tabfill.sql) and the testing infrastructure, which I didn't know how to handle. So I feel overwhelmed each time I try ;-(

But ideally one should be able to go, for example, to http://www.archive.org/details/ol_data, get the whole University of Toronto Library catalog, load it into the local Invenio and use it, maybe just adjusting some valid collection field value. That is not the case now. And it is a pity, because after suitable adjustments Invenio is very capable of handling those records. It is even possible to have something like authority records in it (at least we have them more or less working at http://traces.uab.cat/).

Best regards,

Ferran
Re: Author handling bfe_authors.py et al
On 25.11.2011 11:59, Ferran Jorba wrote:

Hi!

> this is an example of the default values of Invenio not being Marc21 compliant.
> Right. And then these are bad defaults.
> I've complained about this a few times, and I should have filed a task about those defaults, and sent a few patches, although I haven't yet ;-(.
> The reasons why I haven't done it myself, besides the lack-of-time issue (bad excuse) are that on my instances I have a mix of better-than-default values and local ones;

Well, it would be great if you could drop me some sort of list in case your previous post was not complete. We're about to roll out an installation here based on a recent Invenio, so we might work that in if it's not already done.

So the suggestion would be: give me what you have and I'll check it against current git master (probably some weeks back).

[...]

> But ideally one should be able to go, for example, to http://www.archive.org/details/ol_data and get and load all University of Toronto Library catalog in the local Invenio and use it, maybe just adjusting some valid collection field value.

Agreed. But I don't need to go to Toronto; I'd just start out with our own catalogue. Still, it would be cumbersome to work out again everything you may already have solved in your (local) patches.

-- 
Kind regards,

Alexander Wagner
Re: Author handling bfe_authors.py et al
Hi again,

>> The reasons why I haven't done it myself, besides the lack-of-time issue (bad excuse) are that on my instances I have a mix of better-than-default values and local ones;
>
> Well, it would be great if you could drop me some sort of list in case your previous post was not complete. We're about to roll out some installation here based on recent Invenio so we might work that in if it's not already done.

Let me publish my logical fields list here on the list, because it is easy to do and likely to be useful to most readers (I've left out a few local fields):

4. Logical fields overview

    Field      | MARC Tags                                        | Translations
    -----------+--------------------------------------------------+-------------------------------
    any_field  | 00%, 01%, 02%, 03%, 04%, 05%, 06%, 07%, 08%, 09%, | ca, cs, de, el, en, es, fr, it,
               | 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, | no, pt, ru, sk, sv, uk
               | 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, |
               | 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, |
               | 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, |
               | 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, |
               | 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, |
               | 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, |
               | 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, |
               | 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%       |
    title      | 130%, 210%, 222%, 240%, 245%, 246%, 247%, 730%,  | ca, cs, de, el, en, es, fr, it,
               | 740%                                             | no, pt, ru, sk, sv, uk
    author     | 100%, 110%, 111%, 700%, 710%, 711%, 720%         | ca, cs, de, el, en, es, fr, it,
               |                                                  | no, pt, ru, sk, sv, uk
    abstract   | 520%                                             | ca, cs, de, el, en, es, fr, it,
               |                                                  | no, pt, ru, sk, sv, uk
    keyword    | 653%                                             | ca, cs, de, el, en, es, fr, it,
               |                                                  | no, pt, ru, sk, sv, uk
    series     | 830%, 440%, 490%                                 | ca, en, es
    subject    | 600%, 610%, 611%, 650%, 651%                     | ca, cs, de, el, en, es, fr, it,
               |                                                  | no, pt, ru, sk, sv, uk
    fulltext   | 8564%u                                           | ca, cs, de, el, en, es, fr, it,
               |                                                  | no, pt, ru, sk, sv, uk
    collection | 980%                                             | ca, cs, de, el, en, es, fr, it,
               |                                                  | no, pt, ru, sk, sv, uk
    year       | 260%c, 973%y                                     | ca, cs, de, el, en, es, fr, it,
               |                                                  | no, pt, ru, sk, sv, uk
    record_ID  | 001                                              | ca, cs, de, el, en, es, fr, it,
               |                                                  | no, pt, ru, sk, sv, uk
    issn       | 773%x, 022%a                                     | ca, en, es, fr

The indexes map one-to-one onto these logical fields *except* for keyword. What we have done is to keep the proper subject tags on the official 600, 610, 611, 650 and 651, and keyword as 653, but merge them as *indexes*, so the index for keyword (Pàgina inicial > Admin Area > Manage Indexes) covers both the subject and the keyword fields. That's the solution we've come up with.

As for bibformat and friends, that is:

    lib/python/invenio/bibformat_elements/
    etc/bibformat/format_templates/
    etc/bibformat/output_formats/

I keep them under guilt patches (http://repo.or.cz/w/guilt.git or http://packages.debian.org/guilt), but they would only apply to a 0.99.1 release. I can happily send you a tarball for each; but please understand that they are a mix of better, worse and bad solutions, as I have been learning to tame the beast over those years.

I'll come back to you in a while.

Cheers,

Ferran
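The subject/keyword merge described above can be pictured with a small standalone sketch. It is not Invenio code: the dictionaries, the function names and the sample tags are made up for illustration, and it assumes SQL LIKE semantics in which `%` matches any run of characters (other LIKE wildcards are not handled).

```python
import re

# Logical fields as in the overview table (only the two relevant ones).
LOGICAL_FIELDS = {
    "subject": ["600%", "610%", "611%", "650%", "651%"],
    "keyword": ["653%"],
}

# Hypothetical index definitions: the keyword index also covers the subject tags.
INDEXES = {
    "subject": ["subject"],
    "keyword": ["subject", "keyword"],
}


def like_to_regex(pattern):
    """Translate a LIKE-style pattern into a regex ('%' = any run of characters)."""
    return re.compile("^" + ".*".join(re.escape(p) for p in pattern.split("%")) + "$")


def indexes_for_tag(full_tag):
    """Return the names of the indexes that would receive this tag."""
    hits = []
    for index, fields in INDEXES.items():
        patterns = [p for field in fields for p in LOGICAL_FIELDS[field]]
        if any(like_to_regex(p).match(full_tag) for p in patterns):
            hits.append(index)
    return hits


print(indexes_for_tag("650_7a"))  # a subject heading
print(indexes_for_tag("653__a"))  # an uncontrolled keyword
```

A subject heading ends up in both indexes, while an uncontrolled keyword only reaches the keyword index, so a keyword search finds both kinds of terms.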
Hide certain MARC subfields from xml output in the search interface
Hello everyone,

We need to store some 'sensitive' data in the MARC record that should be viewable and editable by the librarian, but should not appear in the xm (MARCXML) or text MARC output formats of the search interface.

After spending some time testing where this could be applied, I realized that although a simple check could be put in the print_record function of search_engine.py, the xm format of the record is already cached and is read and displayed as it is, which renders this 'hack' useless. I verified that if I force on_the_fly=True in the format_record function of bibformat.py, I get what I want, but with a HUGE performance drop, and this is unacceptable.

Is there another way to make this work? Is the cached xm data (in the bibfmt table) used for anything other than display? Should I try to strip the sensitive data from the record only when updating this table? Is this possible?

Any ideas are welcome!

Thanks in advance for your time,

Best regards,

Theodoros Theodoropoulos
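Whichever place turns out to be the right hook, the stripping step itself could look roughly like the sketch below. It works on a plain MARCXML string with the standard library only and does not touch Invenio: the function name, the `HIDDEN` set of (tag, subfield code) pairs, the namespace-free record layout and the sample record are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of sensitive (tag, subfield code) pairs.
HIDDEN = {("595", "a"), ("999", "z")}


def strip_hidden_subfields(marcxml, hidden=HIDDEN):
    """Return the record as a string with the hidden subfields removed."""
    record = ET.fromstring(marcxml)
    for datafield in list(record.findall("datafield")):
        tag = datafield.get("tag")
        for subfield in list(datafield.findall("subfield")):
            if (tag, subfield.get("code")) in hidden:
                datafield.remove(subfield)
        # Drop the whole datafield if no subfield is left in it.
        if datafield.find("subfield") is None:
            record.remove(datafield)
    return ET.tostring(record, encoding="unicode")


# Hypothetical sample record, without XML namespace, for illustration only.
SAMPLE = """<record>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Some title</subfield>
  </datafield>
  <datafield tag="595" ind1=" " ind2=" ">
    <subfield code="a">secret note</subfield>
    <subfield code="b">public note</subfield>
  </datafield>
  <datafield tag="999" ind1=" " ind2=" ">
    <subfield code="z">internal id</subfield>
  </datafield>
</record>"""

print(strip_hidden_subfields(SAMPLE))
```

If the filtering were applied once, when the cached xm version is written, the display path would not need on-the-fly formatting; whether the cached version is also read by anything that needs the full record is exactly the open question above.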