Re: A library agnostic datastructure for MARC ?
Hi Galen,

On Thu, Nov 11, 2010 at 07:40:35PM -0500, Galen Charlton wrote:
> I don't see how a structure like this gets you anywhere closer to an
> abstraction layer that would permit somebody to code in terms of
> semantic concepts like title and author instead of MARC tags,

It doesn't. The fact is that we're working on libraries to do that (see MARC::Mapper from my other mail), and I would really like to interact with all of:

- MARC::Record: it's heavily used in the Koha ILS
- Frédéric Demians's MARC lib, which is much more modern
- what we at BibLibre call a SimpleRecord, which is just a hash of unordered fields

I don't want to write a web of gateways between all those structures and those to come, so I propose a common way to share data between all of our works.

For example: MARC::Template and ISO2709 have internal code to build MARC::Records and SimpleRecords, so they depend on MARC::Record. I would really like to drop this code for something more generic and simple.

> you're looking for a serialization or data structure that is more

I'm not talking about serialization at all. I'm talking about sharing data between MARC-related tools, as PSGI does for the web. Sorry if I wasn't clear.

Regards
--
Marc Chantreux
BibLibre, expert en logiciels libres pour l'info-doc
http://biblibre.com
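To make the "no web of gateways" point concrete, here is a minimal sketch of a single converter from the array structure proposed at the start of this thread to a hash of unordered fields. The name record_to_hash and the hash layout are assumptions made for illustration; this is not BibLibre's actual SimpleRecord format.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten the proposed array-of-arrays record into a
# hash keyed by tag. The resulting layout is an illustrative assumption.
sub record_to_hash {
    my ($record) = @_;
    my %fields;
    for my $field (@$record) {
        if ( ref $field->[0] ) {
            # Data field: [ [ tag, ind1, ind2 ], [ [ code, value ], ... ] ]
            my ( $tag, $ind1, $ind2 ) = @{ $field->[0] };
            my %subfields;
            push @{ $subfields{ $_->[0] } }, $_->[1] for @{ $field->[1] };
            push @{ $fields{$tag} },
                { ind1 => $ind1, ind2 => $ind2, subfields => \%subfields };
        }
        else {
            # Control field: [ tag, value ]
            my ( $tag, $value ) = @$field;
            push @{ $fields{$tag} }, $value;
        }
    }
    return \%fields;
}

my $record = [
    [qw/ 001 12345 /],
    [ [qw/ 200 0 1 /], [ [qw/ a foo /], [qw/ b bar /], [qw/ a foo2 /] ] ],
];
my $simple = record_to_hash($record);
print join( ', ', @{ $simple->{200}[0]{subfields}{a} } ), "\n";
```

With a shared structure, each library would need only one importer and one exporter like this, rather than one converter per pair of libraries.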
Re: A library agnostic datastructure for MARC ?
Hi,

On Thu, Nov 11, 2010 at 7:27 PM, Marc Chantreux wrote:
> simple proposition is:
>
> [ [qw/ 001 value /] # example of control field
> , [qw/ 005 value /] # example of control field
> , [ [qw/ 200 0 1 /] # example of data field
>   , [ [qw/ a foo /]
>     , [qw/ b bar /]
>     , [qw/ a foo2 /]
[snip]

I don't see how a structure like this gets you anywhere closer to an abstraction layer that would permit somebody to code in terms of semantic concepts like title and author instead of MARC tags, but if you're looking for a serialization or data structure that is more convenient for you to deal with, you might find searching the code4lib mailing list archives for "MARC" and "JSON" to be fruitful.

Regards,

Galen
--
Galen Charlton
gmcha...@gmail.com
WIP stuff about MARC manipulation
Hello again,

As I wrote in my last mail, we at BibLibre would really like to write tools that make life easier for the programmer in charge of a migration. MARC::Template is something we're very happy about, but we have more tools that aren't as polished. We already use them successfully, though, so we share them with you as they are.

https://github.com/eiro/p5-ISO2709 is a regex-driven ISO2709 parser. The goal of this library is not performance but flexibility: we were able to read very badly formatted ISO2709 just by changing a few things in the regex. I have never published it so far because I didn't find the time to make things configurable instead of hackable. It's a proof of concept, but I'm very happy with the results.

There is also https://github.com/eiro/p5-MARC-Data, which is a first step towards a very simple way to deal with coded information block (CIB) fields using Moose metaprogramming and a serialization definition. For example, the UNIMARC Biblio 100$a definition is:

    ( [qw( entered 8 Str ) , POSIX::strftime('%Y%m%d',localtime )] # 0-7   Date Entered on File (Mandatory)
    , [qw( publication_type 1 Str u )]                             # 8     Type of Publication Date
    , [qw( publication1 4 )]                                       # 9-12  Publication Date 1
    , [qw( publication2 4 )]                                       # 13-16 Publication Date 2
    , [qw( audience 3 Str u )]                                     # 17-19 Target Audience Code
    , [qw( government 1 Str u )]                                   # 20    Government Publication Code
    , [qw( modified 1 Str 0 )]                                     # 21    Modified Record Code
    , [qw( language 3 Str fre )]                                   # 22-24 Language of Cataloguing (Mandatory)
    , [qw( transliteration 1 Str y )]                              # 25    Transliteration Code
    , [qw( charset1 4 Str 5050)]                                   # 26-29 Character Set (Mandatory)
    , [qw( charset2 4 )]                                           # 30-33 Additional Character Set
    , [qw( title_script 2 Str ba )]                                # 34-35 Script of Title
    )

Each entry gives the attribute name, its length in characters, and optionally a type and a default value.

Another idea is MARC::Mapper (http://www.tinybox.net/2009/08/06/a-marc-mapper-in-few-lines-of-perl/), but I really think it could be written using the MARC::Template syntax.

Regards
--
Marc Chantreux
BibLibre, expert en logiciels libres pour l'info-doc
http://biblibre.com
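A positional definition like the 100$a one above can drive both decoding and encoding of the fixed-width string. The sketch below uses plain unpack and sprintf rather than Moose metaprogramming, so it is not how p5-MARC-Data works. The shortened definition, the fixed date default, and the names decode_fixed and encode_fixed are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical, shortened definition in the spirit of the 100$a example:
# each entry is [ name, length, default ].
my @definition = (
    [ entered          => 8, '20101111' ],
    [ publication_type => 1, 'u' ],
    [ publication1     => 4, '' ],
    [ language         => 3, 'fre' ],
);

# Build an unpack template such as "A8 A1 A4 A3" from the lengths.
my $template = join ' ', map { "A$_->[1]" } @definition;

# Split a fixed-width string into named values (trailing spaces are dropped).
sub decode_fixed {
    my ($string) = @_;
    my %decoded;
    @decoded{ map { $_->[0] } @definition } = unpack $template, $string;
    return \%decoded;
}

# Build the fixed-width string, falling back to defaults and padding with
# spaces so that every value keeps its position.
sub encode_fixed {
    my ($values) = @_;
    return join '', map {
        my ( $name, $length, $default ) = @$_;
        my $value = defined $values->{$name} ? $values->{$name} : $default;
        sprintf '%-*s', $length, substr( $value, 0, $length );
    } @definition;
}

my $encoded = encode_fixed( { publication1 => '2010' } );
print "$encoded\n";
print decode_fixed($encoded)->{language}, "\n";
```

Keeping names, lengths and defaults in one list means the positions in the comments never have to be maintained by hand: they follow from the order and the lengths.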
A library agnostic datastructure for MARC ?
Hello world,

Years ago, I wrote MARC::Template (https://github.com/eiro/MARC-Template) to ease the process of migrating data to the Koha ILS. We often use it at BibLibre (http://biblibre.com).

For a MARC-to-MARC migration that only involves some CRUD manipulations on fields, cleaning some data and moving some fields, going through an API is an awful waste of time: learning to manipulate Perl structures is much more efficient, IMHO.

For a more complex migration mixing data coming from multiple data sources and multiple formats, or even to write a migration from MARC to a modern bibliographic format, we're convinced that the job can be done better and faster by adding a level of abstraction over MARC.

What I mean by abstraction is that the business programmer, as well as the librarian, doesn't care about the 999$x field: they care about authors, titles, year of edition ... That can be partially done with YAML-driven Moose metaprogramming.

Actually, I personally think that it would be possible to write a complete GUI-driven ETL able to deal with MARC.

At the very end of the process, we transform everything into MARC::Record objects to use the MARC::Record serialization, but Frédéric's lib could also be a good output for our libs.

So is there a chance to specify a library-agnostic data structure as a bridge between all your libs, a kind of PSGI for MARC, so that everyone could import from and export to this format and we could easily mix all of them?

A simple proposition is:

    [ [qw/ 001 value /]       # example of control field
    , [qw/ 005 value /]       # example of control field
    , [ [qw/ 200 0 1 /]       # example of data field
      , [ [qw/ a foo /]
        , [qw/ b bar /]
        , [qw/ a foo2 /]
        , [qw/ b bar2 /]
        ]
      ]
    , [ [qw/ 200 0 1 /]       # example of data field
      , [ [qw/ a foo /]
        , [qw/ b bar /]
        , [qw/ a foo2 /]
        , [qw/ b bar2 /]
        ]
      ]
    ]

Regards,
--
Marc Chantreux
BibLibre, expert en logiciels libres pour l'info-doc
http://biblibre.com
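As a rough illustration of how such a structure could be consumed with plain Perl and no object API, here is a minimal sketch. The helper names is_data_field and subfield_values are hypothetical and not part of any existing library. It assumes, as in the proposition, that a control field is [ tag, value ] and a data field is [ [ tag, ind1, ind2 ], [ [ code, value ], ... ] ].

```perl
use strict;
use warnings;

# The structure from the proposition, with its placeholder values.
my $record = [
    [qw/ 001 value /],
    [qw/ 005 value /],
    [ [qw/ 200 0 1 /],
      [ [qw/ a foo /], [qw/ b bar /], [qw/ a foo2 /], [qw/ b bar2 /] ] ],
];

# Hypothetical helper: a field is a data field when its first element is
# itself an array reference ([ tag, ind1, ind2 ]) rather than a plain tag.
sub is_data_field { return ref( $_[0][0] ) eq 'ARRAY' }

# Hypothetical helper: all values of subfield $code in the data fields
# tagged $tag, in the order they appear in the record.
sub subfield_values {
    my ( $record, $tag, $code ) = @_;
    return map { $_->[1] }
        grep { $_->[0] eq $code }
        map  { @{ $_->[1] } }
        grep { is_data_field($_) && $_->[0][0] eq $tag } @$record;
}

print join( ', ', subfield_values( $record, '200', 'a' ) ), "\n";
```

Because subfields are stored as an ordered list of pairs rather than a hash, repeated codes and their order survive, which matters for MARC.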
Re: Moose based Perl library for MARC records
2010/11/11 Frédéric DEMIANS :
> Thanks all for your suggestions. I have to choose another name for sure.
> Marc::Moose seems to be a reasonable choice. But I'm very tempted by a
> shorter option: MarcX, MarcX::Record, MarcX::Parser, MarcX::Reader::Isis,
> etc. Any objection?

I can't think of a better choice than MARC::Moose::, e.g., MARC::Moose::Record. There are a lot of MARC::Something's out there, and what differentiates yours from those appears to be Moose.

Yes, you might rewrite it in the future using Marmoset, in which case I'd probably suggest renaming it to MARC::Marmoset::Record. That way, Moose enthusiasts could take over maintaining the Moose version. If you catch my drift.

I'm not crazy about it necessarily. I just can't think of anything I like better. Sleeping on it some more might be in order, maybe.

Just my .10 francs worth,

Brad
Re: Moose based Perl library for MARC records
CPAN stores distributions under author subdirectories. But the module namespace is done separately and reflects the function of the module. In the case of the MARC:: namespace, I think Ed Summers is the only one who has remained involved since the beginning (back in the 1990's). Had we used names at the start, it would have been BBIRTH::MARC (which would have been confusing to absolutely everyone even back then) since I uploaded the first release to CPAN.

Also, uppercase MARC:: is the preferred CPAN practice for an acronym of this sort. Compare to existing module names like CGI, DBI, ODBC, ASP, and PDL.

-bill

On Thu, 2010-11-11 at 20:39 +0100, Frédéric DEMIANS wrote:
> > butting in an interesting discussion ...
>
> Thanks for joining the discussion.
>
> > Would Org::Demians::MARC::Record ( or Tamil::MARC::Record ) be very
> > wrong, unless you aim to provide the ultimate collection of MARC
> > modules that would make all the others obsolete ?
>
> Yes, I aim to... In the Java world, I would have named it
> fr.tamil.marc... I'm not sure it's the usage in CPAN. And there is this
> suggestion to stay under the MARC:: umbrella.
>
> > Moose is great and I love it, but it's not forever ... in a few years
> > we'll use Elk or something else, and you might want to port your
> > modules ...
>
> I have other plans for the future...
Re: Moose based Perl library for MARC records
> butting in an interesting discussion ...

Thanks for joining the discussion.

> Would Org::Demians::MARC::Record ( or Tamil::MARC::Record ) be very
> wrong, unless you aim to provide the ultimate collection of MARC
> modules that would make all the others obsolete ?

Yes, I aim to... In the Java world, I would have named it fr.tamil.marc... I'm not sure it's the usage in CPAN. And there is this suggestion to stay under the MARC:: umbrella.

> Moose is great and I love it, but it's not forever ... in a few years
> we'll use Elk or something else, and you might want to port your
> modules ...

I have other plans for the future...
Re: Moose based Perl library for MARC records
Hello,

butting in an interesting discussion ...

Would Org::Demians::MARC::Record ( or Tamil::MARC::Record ) be very wrong, unless you aim to provide the ultimate collection of MARC modules that would make all the others obsolete ?

Moose is great and I love it, but it's not forever ... in a few years we'll use Elk or something else, and you might want to port your modules ...

Emil

2010/11/11 Frédéric DEMIANS :
>
>> I was going to express the same concern. Keeping everything under
>> MARC:: may also make it a tiny bit easier to find the existing
>> alternatives for, well, parsing MARC records. I would +1 MARC::Moose.
>
> I understand this point. I don't like the idea of using 'Moose' in the name
> of an object using Moose. As this library is a MARC::Record alternative, as you
> said, why not simply Marc::Alt?
>
>> Also, to be purely pedantic, "MARC" is an acronym for "MAchine-Readable
>> Cataloguing", while "Marc" is a person's name, so wherever it ends up,
>> please keep it uppercase.
>
> On this point, my convention is just to begin any element of a class name with an
> uppercase letter and then lowercase. This way there is no need to think about it:
> is it an acronym? should I say Koha or KOHA? SOLR or SolR? (private joke)
>
> But I will think about it since MARC::Record is so widely used.
>
> Thanks.
>

--
== Emil Perhinschi
http://www.lunch-break.ro ==
Re: Moose based Perl library for MARC records
> I was going to express the same concern. Keeping everything under
> MARC:: may also make it a tiny bit easier to find the existing
> alternatives for, well, parsing MARC records. I would +1 MARC::Moose.

I understand this point. I don't like the idea of using 'Moose' in the name of an object using Moose. As this library is a MARC::Record alternative, as you said, why not simply Marc::Alt?

> Also, to be purely pedantic, "MARC" is an acronym for "MAchine-Readable
> Cataloguing", while "Marc" is a person's name, so wherever it ends up,
> please keep it uppercase.

On this point, my convention is just to begin any element of a class name with an uppercase letter and then lowercase. This way there is no need to think about it: is it an acronym? should I say Koha or KOHA? SOLR or SolR? (private joke)

But I will think about it since MARC::Record is so widely used.

Thanks.
Re: Moose based Perl library for MARC records
Gah. Replying to all this time instead of just Galen, as I did three hours ago, for my $0.02...

2010/11/11 Galen Charlton :
> Hi,
>
> 2010/11/11 Frédéric DEMIANS :
>> Thanks all for your suggestions. I have to choose another name for sure.
>> Marc::Moose seems to be a reasonable choice. But I'm very tempted by a
>> shorter option: MarcX, MarcX::Record, MarcX::Parser, MarcX::Reader::Isis,
>> etc. Any objection?
>
> Not from me, but I'm not sure if the CPAN folks will want yet another
> top-level namespace.

I was going to express the same concern. Keeping everything under MARC:: may also make it a tiny bit easier to find the existing alternatives for, well, parsing MARC records. I would +1 MARC::Moose.

Also, to be purely pedantic, "MARC" is an acronym for "MAchine-Readable Cataloguing", while "Marc" is a person's name, so wherever it ends up, please keep it uppercase.

--
Dan Scott
Laurentian University
RE: Moose based Perl library for MARC records
2010/11/11 Frédéric DEMIANS :
>> Thanks all for your suggestions. I have to choose another name for sure.
>> Marc::Moose seems to be a reasonable choice. But I'm very tempted by a
>> shorter option: MarcX, MarcX::Record, MarcX::Parser, MarcX::Reader::Isis,
>> etc. Any objection?

Since MARC is an acronym, I believe all of its letters should be capitalized. Trying to remember to lowercase some of them while coding would make me less likely to want to use your modules.

As for adding another top level instead of keeping MARC:: as the primary prefix for the modules, since the modules you are working on seem to be dealing with manipulating standard MARC records rather than something new called "MarcX", I'd say MARC:: would be the place I'd expect to find such modules.

Thursday, November 11, 2010 8:28 AM Dueber, William [dueb...@umich.edu]:
>I think we should revisit "Biblio::". Yes, I know MARC isn't used only for
>bibliographic data, but it's sure as hell not used to speak of outside the
>library/museum world. 'Biblio' might not be perfect, but it's certainly not
>misleading in any meaningful way.

As mentioned above, MARC::* is where I'd be likely to look for modules related to manipulating MARC records. Maybe it's because I haven't needed any of the Biblio::* modules, but I'd be less likely to look there for MARC manipulation modules.

Since the modules under discussion appear to be an alternative to the current standard modules for MARC manipulation, the MARC::Record family, it seems like something within MARC::* would be appropriate (as long as the names don't interfere with the existing modules but instead can be used in cooperation with them).

Bryan Baldus
bryan.bal...@quality-books.com
eij...@cpan.org
http://home.comcast.net/~eijabb/
Re: Moose based Perl library for MARC records
Hi,

2010/11/11 Dueber, William :
> I think we should revisit “Biblio::”. Yes, I know MARC isn’t used only for
> bibliographic data, but it’s sure as hell not used to speak of outside the
> library/museum world. ‘Biblio’ might not be perfect, but it’s certainly not
> misleading in any meaningful way.

It's up to Frédéric, of course, but since nearly all of the current Perl modules used for handling MARC are in the MARC:: namespace, sticking with the precedent will make it easier for somebody searching CPAN to find all of the choices. Anybody from outside librarydom who realizes that they're stuck dealing with MARC records will presumably have seen the records identified as "MARC".

Regards,

Galen
--
Galen Charlton
gmcha...@gmail.com
Re: Moose based Perl library for MARC records
I think we should revisit "Biblio::". Yes, I know MARC isn't used only for bibliographic data, but it's sure as hell not used to speak of outside the library/museum world. 'Biblio' might not be perfect, but it's certainly not misleading in any meaningful way.

On 11/11/10 10:23 AM, "Galen Charlton" wrote:

Hi,

2010/11/11 Frédéric DEMIANS :
> Thanks all for your suggestions. I have to choose another name for sure.
> Marc::Moose seems to be a reasonable choice. But I'm very tempted by a
> shorter option: MarcX, MarcX::Record, MarcX::Parser, MarcX::Reader::Isis,
> etc. Any objection?

Not from me, but I'm not sure if the CPAN folks will want yet another top-level namespace.

Regards,

Galen
--
Galen Charlton
gmcha...@gmail.com
Re: Moose based Perl library for MARC records
Hi,

2010/11/11 Frédéric DEMIANS :
> Thanks all for your suggestions. I have to choose another name for sure.
> Marc::Moose seems to be a reasonable choice. But I'm very tempted by a
> shorter option: MarcX, MarcX::Record, MarcX::Parser, MarcX::Reader::Isis,
> etc. Any objection?

Not from me, but I'm not sure if the CPAN folks will want yet another top-level namespace.

Regards,

Galen
--
Galen Charlton
gmcha...@gmail.com