AW: [librecat-dev] A common MARC record path language

Klee, Carsten Wed, 19 Feb 2014 05:30:13 -0800

Hi Thomas and Patrick!

I think the whole problem lies in the limited expressivity of strings. MARCspec 
is pretty close to XPath in its approach, but without regular expressions and 
functions like first(), last() etc. But even with XPath it would be pretty hard 
to get the character before a subfield in a MARCXML file.


The only solution I can think of is using regular expressions. And I'm not 
convinced that bringing them into MARCspec is a good idea. As I already 
mentioned in the spec, MARCspec is not independent of the application using 
it. Taking regular expressions into MARCspec wouldn't make the application 
more usable, but would bloat the specification.

One example:

The data in field 245 is:

"$aConcerto per piano n. 21, K 467$h[sound recording] /$cW.A. Mozart"

The desired result is (rule: take everything from 245 until the string ' /$' 
appears):

"Concerto per piano n. 21, K 467 [sound recording]"

Imagine a MARCspec with regular expressions (pseudo code coming up!):

marcspec = "245.match(/(.*)\s\/\$/)"
titleData = getMARCspec(record, marcspec)
print titleData[1]
// should result in "$aConcerto per piano n. 21, K 467$h[sound recording]"

Now pretty much the same, but without the regular expression in the MARCspec:

marcspec = "245"
titleData = getMARCspec(record, marcspec).match(/(.*)\s\/\$/)
print titleData[1]
// should result in "$aConcerto per piano n. 21, K 467$h[sound recording]"

You see, nothing is gained here: in both cases the application has to evaluate 
the regular expression anyway.
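
To make the capture concrete, here is a minimal runnable sketch of what the regular expression from the two snippets above extracts. It assumes the field content is already available as one serialized string, as in the example; the variable names are only illustrative and not part of MARCspec or Catmandu.

```javascript
// Field 245 serialized as in the example above ("$" introduces a subfield code).
const field245 = "$aConcerto per piano n. 21, K 467$h[sound recording] /$cW.A. Mozart";

// Capture everything before the sequence: one whitespace character, "/", "$".
const match = field245.match(/(.*)\s\/\$/);

// match[0] is the whole matched text, match[1] is the first capture group.
// match is null when the sequence does not occur in the field at all.
const beforeSlash = match === null ? null : match[1];

console.log(beforeSlash);
// prints: $aConcerto per piano n. 21, K 467$h[sound recording]
```

The subfield codes are still part of the captured string, so a second step is needed to get the plain title.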

But an application could provide a special function like

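function takeEverythingFromSpecUntilYouHitBeforeSubfield(marcspec, hitWhat, record)
{
    // get the data before the / or = or whatever hitWhat is
    regex = new RegExp("(.*)\\s\\" + hitWhat + "\\$")
    data = getMARCspec(record, marcspec).match(regex)[1]

    // now split on subfield codes; dataSplit[0] is the empty string
    // before the first subfield code, so the loop starts at index 1
    dataSplit = data.split(/\$[a-z0-9]/)

    // loop everything but the last piece into result
    result = ""
    for (i = 1; i < dataSplit.length - 1; i++)
    {
        result += dataSplit[i] + " "
    }
    // the last piece is at index length - 1 and gets no trailing space
    result += dataSplit[dataSplit.length - 1]

    return result
}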

In Catmandu or elsewhere the user calls the function

takeEverythingFromSpecUntilYouHitBeforeSubfield("245","/",record)

--> this should result in the desired "Concerto per piano n. 21, K 467 [sound 
recording]".
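
For anyone who wants to try this out, below is a self-contained sketch of such an application-side helper. Everything in it is an assumption made for illustration: getMARCspec here is only a stand-in that looks up a field tag in a plain object (it is not a real MARCspec implementation), the record layout is made up, and takeUntilDelimiter and escapeRegExp are hypothetical names. Unlike the pseudo code, it escapes the delimiter before building the regular expression and returns null when nothing matches.

```javascript
// Hypothetical stand-in for a MARCspec resolver: it only looks up a field tag
// in a plain object and returns the serialized subfields as one string.
function getMARCspec(record, marcspec) {
    return record[marcspec];
}

// Escape regex metacharacters so that a delimiter like "." or "+" stays literal.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the field content up to (not including) the delimiter that directly
// precedes a subfield code, with the subfield codes replaced by single spaces.
function takeUntilDelimiter(record, marcspec, delimiter) {
    const data = getMARCspec(record, marcspec);
    if (typeof data !== "string") {
        return null; // field not present in the record
    }
    const regex = new RegExp("(.*)\\s" + escapeRegExp(delimiter) + "\\$");
    const match = data.match(regex);
    if (match === null) {
        return null; // delimiter not found in front of a subfield
    }
    // Split on subfield codes and drop the empty piece before the first code.
    return match[1]
        .split(/\$[a-z0-9]/)
        .filter((part) => part !== "")
        .join(" ");
}

// Made-up record layout: field tag mapped to its serialized content.
const record = {
    "245": "$aConcerto per piano n. 21, K 467$h[sound recording] /$cW.A. Mozart"
};

console.log(takeUntilDelimiter(record, "245", "/"));
// prints: Concerto per piano n. 21, K 467 [sound recording]
```

The point stays the same: the MARCspec itself is just "245", and the cataloging-rule logic lives in the application.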

If there is any other approach you can think of, pleeeease make a proposal or 
start a substantive discussion here. Otherwise I can't see any option for 
solving this problem in MARCspec.

Cheers!

Carsten
_______________________________________________
Carsten Klee
Abt. Überregionale Bibliographische Dienste IIE
Staatsbibliothek zu Berlin - Preußischer Kulturbesitz

Fon:  +49 30 266-43 44 02

> -----Original Message-----
> From: Thomas Berger [mailto:t...@gymel.com]
> Sent: Wednesday, 19 February 2014 01:04
> To: Klee, Carsten; 'Patrick Hochstenbach'
> Cc: v...@gbv.de; librecat-...@mail.librecat.org; perl4lib@perl.org
> Subject: Re: [librecat-dev] A common MARC record path language
> 
> On 18.02.2014 17:47, Klee, Carsten wrote:
> 
> > I understand that there is MARC data combined with cataloging rules. We
> > don't use this approach within our MARC. So I'm not really aware of the
> > problematics.
> 
> "Your" MARC however will be very much interested in "/" (or "=") as the
> first character of some subfield in 245 if I recall correctly. Not such a
> big difference I would think. But maybe a slight complication of the
> matter, since MARCspec should have to cope with both approaches...
> 
> Thomas Berger
