New version of MARC::Lint available
I have posted a new version of MARC::Lint to CPAN [1]. This version applies the changes found in MARC 21 updates 17 [2] and 18 [3]. [1] http://search.cpan.org/~eijabb/MARC-Lint_1.48/ [2] http://www.loc.gov/marc/up17bibliographic/bdapndxg.html [3] http://www.loc.gov/marc/bibliographic/bdapndxg.html Thank you for your time. Bryan Baldus bryan.bal...@quality-books.com eij...@cpan.org http://home.comcast.net/~eijabb/
RE: Customizing MARC::Errorchecks
On Tuesday, July 17, 2012 4:34 PM, Shelley Doljack [sdolj...@stanford.edu] wrote: I'm playing around with using MARC::Errorchecks for reviewing ebook records we get from vendors. I want to make some modifications to the module, but I find that if I do so in a similar manner described in the tutorial for customizing MARC::Lint, by making a subclass of the module, it doesn't work. Is this not possible with Errorchecks?

Indeed, MARC::Errorchecks was not written in the object-oriented style that MARC::Lint uses. Skimming through the code just now (I've not worked with it as regularly as I might like to be able to keep it fresh in my memory), I believe it is essentially a collection of subs with a wrapper sub to call each check--check_all_subs() calls each of the checking subroutines and returns the arrayref of any errors found. When I wrote it I was still early in learning Perl (and while I've gotten better since then, lack of recent practice working with it hasn't necessarily improved my knowledge of the language), so I'm sure it's not the most optimized code possible. check_all_subs() and the POD comments could serve as an index to each of the checks, with the SYNOPSIS showing examples of how to call the individual checks.

That said, if you have ideas for additions or changes, or other questions, I welcome hearing about them, either to add to the base module or to help with creating a related module of your own. I do know that I need to get working on the changes required for RDA records, but haven't really even started looking into the challenges those will pose (though that will likely result in a new module or more devoted just to RDA, and will also likely require changes/subclasses to MARC::Lint and MARC::Lintadditions).
Also of note, I have a newer version I've just uploaded to CPAN [1] with the following changes. In addition to those listed below, I plan on removing MARC::Lint::CodeData from the Errorchecks distribution and then requiring MARC::Lint, which includes CodeData, to hopefully resolve issues with installing both module packages at the same time due to this file.

Version 1.16: Updated May 16-Nov. 14, 2011. Released 7-17-2012.
-Removed MARC::Lint::CodeData and require MARC::Lint
-Turned off check_fieldlength($record) in check_all_subs()
-Turned off checking of floating hyphens in 520 fields in findfloatinghyphens($record)
-Updated validate008 subs (and 006) related to 008/24-27 (Books and Continuing Resources) for MARC Update no. 10, Oct. 2009; no. 11, 2010; no. 12, Oct. 2010; and no. 13, Sept. 2011.
-Updated %ldrbytes with leader/18 'c' and redefinition of 'i' per MARC Update no. 12, Oct. 2010.

Version 1.15: Updated June 24-August 16, 2009. Released , 2009.
-Updated checks related to 300 to better account for electronic resources.
-Revised wording in validate008($field008, $mattype, $biblvl) language code (008/35-37) for ' '/zxx.
-Updated validate008 subs (and 006) related to 008/24-27 (Books and Continuing Resources) for MARC Update no. 9, Oct. 2008.
-Updated validate008 sub (and 006) for Books byte 33, Literary form, invalidating code 'c' and referring it to 008/24-27 value 'c'.
-Updated video007vs300vs538($record) to allow Blu-ray in 538 and 's' in 007/04.
[1] While the CPAN indexer works on that: http://www.cpan.org/authors/id/E/EI/EIJABB/MARC-Errorchecks-1.16.tar.gz, I've also posted the file to my website: http://home.comcast.net/~eijabb/bryanmodules/MARC-Errorchecks-1.16.tar.gz, with text versions of each file visible in: http://home.comcast.net/~eijabb/bryanmodules/MARC-Errorchecks-1.16

Finally, I meant to mention it on this list earlier, but I've posted a new version of MARC::Lint, 1.45, to CPAN [2], with the current development version (as of now, the same as CPAN's version) in SourceForge's Git repository [3]. Updates to that module include:
- Updated Lint::DATA section with Update No. 10 (Oct. 2009) through Update No. 14 (Apr. 2012)
- Updated _check_article with the exceptions: 'A ', 'L is '

[2] http://search.cpan.org/~eijabb/MARC-Lint-1.45/
[3] http://marcpm.git.sourceforge.net/git/gitweb.cgi?p=marcpm/marcpm;a=summary

I hope this helps, Bryan Baldus Cataloger Quality Books Inc. 1-800-323-4241x402 bryan.bal...@quality-books.com eij...@cpan.org http://home.comcast.net/~eijabb/
RE: MARC::Record / MARC::File::XML bug when fields contain newlines?
On Thursday, January 12, 2012 11:59 AM, arvinport...@lycos.com [mailto:arvinport...@lycos.com] wrote: I could have sworn I have processed MARC records containing newlines with no problems in the past (i.e., not records converted from XML), though I've never tried to validate them with MARCEdit. ... Looks like MARC::Record is doing its job correctly. Perhaps changing MARC::File::XML is in order.

MARC::File::USMARC includes a line in sub _next:

    # remove illegal garbage that sometimes occurs between records
    $usmarc =~ s/^[ \x00\x0a\x0d\x1a]+//;

If I remember correctly, I believe this was added a few years ago in response to similar questions about newlines appearing in records (or after someone experienced problems with newlines and/or end-of-file characters in files of records--the newline removal may have always been there; I think I may have added \x1A after finding it in some files I was working with). I'm not familiar enough with MARC::File::XML to know how it deals with end-of-line characters.

Bryan Baldus Cataloger Quality Books Inc. The Best of America's Independent Presses 1-800-323-4241x402 bryan.bal...@quality-books.com eij...@cpan.org http://home.comcast.net/~eijabb/
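For anyone curious, the cleanup line from sub _next can be exercised on its own in core Perl; a minimal sketch, using a made-up stand-in string rather than a real record:

```perl
use strict;
use warnings;

# Apply the same substitution MARC::File::USMARC uses between records:
# leading spaces, NULs, line feeds, carriage returns, and end-of-file
# (0x1A) bytes before the record data are stripped.
my $usmarc = " \x00\x0d\x0a\x1a00080nam...record data...";
$usmarc =~ s/^[ \x00\x0a\x0d\x1a]+//;
print $usmarc, "\n";  # 00080nam...record data...
```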
RE: Typo in MARC::Record tutorial.
On Sunday, May 15, 2011 4:40 PM, Mike Barrett [coffeeisl...@gmail.com] wrote: In the MARC::Batch example is this line: 5 my $batch = MARC::Batch('USMARC', 'file.dat'); I just found out it should be: 5 my $batch = MARC::Batch->('USMARC', 'file.dat');

Or (based on code in programs I've been using):

    my $batch = MARC::Batch->new('USMARC', 'file.dat');

Bryan Baldus bryan.bal...@quality-books.com eij...@cpan.org http://home.comcast.net/~eijabb/
RE: Moose based Perl library for MARC records
2010/11/11 Frédéric DEMIANS f.demi...@tamil.fr: Thanks all for your suggestions. I have to choose another name for sure. Marc::Moose seems to be a reasonable choice. But I'm very tempted by a shorter option: MarcX, MarcX::Record, MarcX::Parser, MarcX::Reader::Isis, etc. Any objection?

Since MARC is an acronym, I believe all of its letters should be capitalized. Trying to remember to lowercase some of them while coding would make me less likely to want to use your modules. As for adding another top level instead of keeping MARC:: as the primary prefix for the modules, since the modules you are working on seem to be dealing with manipulating standard MARC records rather than something new called MarcX, I'd say MARC:: would be the place I'd expect to find such modules.

Thursday, November 11, 2010 8:28 AM Dueber, William [dueb...@umich.edu]: I think we should revisit Biblio::. Yes, I know MARC isn't used only for bibliographic data, but it's sure as hell not used to speak of outside the library/museum world. 'Biblio' might not be perfect, but it's certainly not misleading in any meaningful way.

As mentioned above, MARC::* is where I'd be likely to look for modules related to manipulating MARC records. Maybe it's because I haven't needed any of the Biblio::* modules, but I'd be less likely to look there for MARC manipulation modules. Since the modules under discussion appear to be an alternative to the current standard modules for MARC manipulation, the MARC::Record family, it seems like something within MARC::* would be appropriate (as long as the names don't interfere with the existing modules but instead can be used in cooperation with them). Bryan Baldus bryan.bal...@quality-books.com eij...@cpan.org http://home.comcast.net/~eijabb/
RE: MARC::Field-subfields function
On Wednesday, September 08, 2010 3:51 PM, Justin Rittenhouse [mailto:jritt...@nd.edu] wrote: I'm relatively new to Perl and very new to the MARC::Record module. I'm trying to use the subfields function (my @subfields = $field->subfields();), but I'm getting an error: Can't use an undefined value as an ARRAY reference at /usr/lib64/perl5/vendor_perl/5.8.8/MARC/Field.pm line 275. I'm not familiar enough with Perl to figure out what the function is actually doing, so I can't figure out if this is a bug or if I missed something in the tutorial. Other functions off of the $field variable work (I can pull the tag, indicator, and as_string functions).

It's difficult to say what went wrong without a little more context. In MARC::Lint, to access the subfields of a field, the following code appears fairly frequently to break down the subfields into code+data pairs in an array:

    #where $field is a MARC::Field object
    my @subfields = $field->subfields();
    my @newsubfields = ();
    while (my $subfield = pop(@subfields)) {
        my ($code, $data) = @$subfield;
        unshift (@newsubfields, $code, $data);
    } # while

What does your code look like in the area that is producing the error?

Thank you, Bryan Baldus bryan.bal...@quality-books.com eij...@cpan.org http://home.comcast.net/~eijabb/
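To see what that loop does without a MARC::Record install, here is the same pop/unshift pattern run in core Perl over plain arrayrefs standing in for the [code, data] pairs that subfields() returns (a sketch with made-up subfield data, not the module's own code):

```perl
use strict;
use warnings;

# Flatten an array of [code, data] pairs into a flat
# code, data, code, data list, preserving the original order:
# pop takes pairs from the end, unshift puts them back at the front.
my @subfields = (['a', 'Title :'], ['b', 'subtitle /'], ['c', 'by Author.']);
my @newsubfields = ();
while (my $subfield = pop(@subfields)) {
    my ($code, $data) = @$subfield;
    unshift(@newsubfields, $code, $data);
} # while
print join('|', @newsubfields), "\n";  # a|Title :|b|subtitle /|c|by Author.
```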
MARC::Lint 1.44
MARC::Lint's most recent version is maintained in CVS at SourceForge, http://marcpm.cvs.sourceforge.net/viewvc/marcpm/marc-lint/, with less frequent updates to CPAN when a significant enough number of changes has been made. I have now posted 1.44 to CPAN, which should include MARC updates 8 and 9, as well as some other minor changes. Please let me know of any problems. Thank you, Bryan Baldus bryan.bal...@quality-books.com eij...@cpan.org http://home.inwave.com/eija
RE: [Patch] Escape marc tag/code/indicators in Marc::File::XML
On Wednesday, July 22, 2009 4:10 PM, Galen Charlton [galen.charl...@liblime.com] wrote: Funny you should mention CVS. I have a general question for the MARC/Perl hackers: Ed mentioned a while back moving from CVS to a more modern VCS such as Subversion or (my preference) Git. I'm willing to do the legwork to get the repositories moved. Thoughts?

Speaking as a hobbyist programmer, I've only used CVS, and would hope that a move to a different system wouldn't make it more complicated or difficult to use. Until last November, my main development machine was (and still would be) a PowerMac 7500/G3 with MacOS 9. When I tried to update SourceForge CVS this May using my Mac, I believe my SSH login failed (it had worked fine in August 2008), so I switched to updating SourceForge CVS using WinCvs on my Windows Vista laptop (Nov. 2008). I'm not sure what changed to prevent the Mac from being able to get an SSH connection to SourceForge, but I chalked it up to age (a SourceForge update making old operating systems obsolete, or some change to SSH that I couldn't figure out how to fix in the MacSSH client; it did seem like it took a little bit of work getting WinCvs set up, as well), and concluded that from now on, the Windows machine will be what I need to use to update anything on SourceForge. So, as long as there is an easy-to-use Windows-based client for the other version control systems, I probably wouldn't have a problem with switching.

Thank you for your time, Bryan Baldus bryan.bal...@quality-books.com eij...@cpan.org http://home.inwave.com/eija
MARC Errorchecks and Lint Module updates
I have updated MARC::Errorchecks in CPAN, releasing version 1.14, and have updated MARC::Lint in CVS on SourceForge. Changes for each are listed below.

MARC::Errorchecks changes:

Version 1.14: Updated Oct. 21, 2007, Jan. 21, 2008, May 20, 2008. Released May 25, 2008.
-Updated %ldrbytes with leader/19 per Update no. 8, Oct. 2007. Check for validity of leader/19 not yet implemented.
-Updated _check_book_bytes with code '2' ('Offprints') for 008/24-27, per Update no. 8, Oct. 2007.
-Updated check_245ind1vs1xx($record) with TODO item and comments.
-Updated check_bk008_vs_300($record) to allow leaves of plates (as opposed to leaves, when no p. or v. is present), leaf, and column(s).
-Updated test in Errorchecks.t to remove check for LCCN starting with year greater than the current year. This was at 2008, which is no longer later. A test may be implemented in the future that will be less likely to break with the passage of time.

MARC::Lint changes:
- Updated _check_article with the exception 'A to '
- Updated Lint::DATA section with Update No. 8 (Oct. 2007)

Please let me know of any problems, suggestions, etc. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
MARC-Lintadditions 1.13 update
I have posted version 1.13 of MARC::Lintadditions to my home page [1]. Changes are listed below. To install, place the .pm file in your Perl's site/lib/MARC, next to MARC::Lint, MARC::Record, etc. Included in the tar.gz file (and unzipped version) is lintadditions.t.pl.txt, a test file that should pass if everything is installed properly.

Version 1.13: Updated Oct. 21, 2007. Released Oct. 21, 2007.
-Updated check_100 (and by call, all check_1xx, check_7xx, and check_8xx):
--Non-numeric reduced from non-digits to [0-5, 7, 9], since 6 and 8 follow different rules.
--Added check for punctuation preceding $e.
-Updated check_260, check_440, and check_490 to deal with subfield 6 being 1st when checking for subfield a as first subfield.

[1] http://home.inwave.com/eija/bryanmodules/

Please let me know of any problems, corrections, suggestions, or questions. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
MARC::Lint and Errorchecks updated on CPAN
I've posted updated versions to CPAN of MARC::Lint (v. 1.43) and MARC::Errorchecks (v. 1.13). I've also uploaded new versions of MARC::Lintadditions (v. 1.12) and a stand-alone copy of MARC-Lint-CodeData (v. 1.18) to my personal home page [1]. Note: MARC::Lintadditions is provided as a stand-alone module and must be installed manually (copy the .pm to the MARC:: folder, next to Lint, Record, Errorchecks, etc.). I still hope to integrate most of its checks into MARC::Lint, but progress so far has been rather slow due to other projects. Other notes: The version of MARC::Lint::CodeData provided with Lint and Errorchecks should be identical. I've experienced difficulty installing both modules through PPM on Windows, perhaps due to CodeData being included with both modules. Changes for each appear below:

MARC::Lint:

1.43 Wed October 3 19:36:00 CDT 2007
[THINGS THAT MAY BREAK YOUR CODE]
- Updated Lint::DATA section with Update No. 7 (Oct. 2006)
- MARC::Lint is incompatible with Business::ISBN versions 2.00-2.02_01. Business::ISBN versions below 2 and 2.02_02 or above should work.
- Updated check_record's treatment of 880 fields. Now if the tagno is 880, check_record attempts to look at subfield 6 for the linked tagno and uses that as the basis for the tagno to be checked.
- Updated _check_article to account for 880, using subfield 6 linked tagno instead.
- Updated _check_article to account for articles followed by parentheses, apostrophes, and/or quotes. Also related bug fixes for counting punctuation around the article.
- For subfield 6, it should always be the 1st subfield according to MARC 21 specifications, so check_245 has been updated to account for subfield 6 being 1st, rather than requiring subfield a to be 1st.
- Added new test, test880and6.t, for 880 field and for subfield 6.
- Added TODO concerning subfield 9. This subfield is not officially allowed in MARC, since it is locally defined. Some way needs to be made to allow messages/warnings about this subfield to be turned off.
- Added TODO concerning subfield 8. This subfield could be the 1st or 2nd subfield, so the code that checks for the 1st few subfields (check_245, check_250) should take that into account.
- Updated MARC::Lint::CodeData with most recent version.

MARC::Errorchecks:

Version 1.13: Updated Aug. 26, 2007. Released Oct. 3, 2007.
-Uncommented valid MARC 21 leader values in %ldrbytes to remove local practice. Libraries wishing to restrict leader values should comment out individual bytes to enable errors when an unwanted value is encountered.
-Added ldrvalidate.t.pl and ldrvalidate.t tests.
-Includes version 1.18 of MARC::Lint::CodeData.

MARC::Lintadditions:

Version 1.12: Updated Mar. 1-Aug. 26, 2007. Released Oct. 3, 2007.
-Updated check_042 with new code, ukblderived, from Technical Notice for Aug. 13, 2007.
-Updated check_042 with new code, scipio, from Technical Notice for Mar. 1, 2007.
-Updated check_xxx methods (check_250) to account for subfield '6' as 1st subfield.

MARC::Lint::CodeData.pm:

Versions 1.15 to 1.18: Updated Feb. 28, 2007-Aug. 14, 2007.
-Added new source codes from Technical Notice of Aug. 13, 2007.
-Added new source codes from Technical Notice of July 13, 2007.
-Added new source codes from Technical Notice of Apr. 5, 2007.
-Added new country and geographic codes from Technical Notice of Feb. 28, 2007.
-Added 'yu ' to list of obsolete codes.

[1] http://home.inwave.com/eija/bryanmodules/

Please let me know of any problems, corrections, or suggestions. Thank you for your assistance, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
MARC::Record record length and leader bug?
A recent posting on the OCLC-CAT discussion list (Help needed de-duping, editing and exporting a raw MARC file with Connexion) mentions difficulty the poster is experiencing with using records with too many 949 fields for MarcEdit to load. This led me to attempt to create a test MARC file to see if I could replicate the problem. In the process, I believe I may have found a problem with the way MARC::Record updates the leader for the record length. Starting with a file containing a minimal raw MARC record (leader, 001 of '1', 008, and 245 of '.'), I ran the file through the loop:

    while (my $record = $batch->next()) {
        for my $fieldno (0..4810) { #where 4810 was the approximate number of fields needed to push the record length past 99999
            my $new_field = MARC::Field->new('949', '', '', a => $fieldno);
            $record->append_fields($new_field);
        } #for fields
        print OUT $record->as_usmarc(); #where OUT is an export file previously opened
    } # while

The output file shows the start of the leader as 100032pam 22577931. MARC::Record::set_leader_lengths has the line:

    substr($self->{_leader},0,5) = sprintf("%05d",$reclen);

Is this supposed to limit $reclen to 5 characters, or does sprintf "%05d" simply prepend the necessary 0s to make sure the length is at least 5 digits? Since a record length over 99999 is impossible, it might be good to have MARC::Record complain about exceeding the record size limit if $reclen > 99999, and to not exceed 5 characters when setting the record length. Please correct me if I am wrong. Thank you for your assistance, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
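The sprintf question can be answered with a quick core-Perl check (no MARC modules involved): "%05d" specifies a minimum field width of 5, padding with leading zeros, but it does not truncate wider values, which would explain a 6-character length overflowing the 5-byte leader field.

```perl
use strict;
use warnings;

# "%05d" pads short numbers with leading zeros to a width of 5,
# but values wider than 5 digits pass through untruncated.
printf("%05d\n", 42);      # 00042
printf("%05d\n", 100032);  # 100032 (6 characters; overflows a 5-byte field)
```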
RE: Removing duplicate fields with MARC::record
Note: my comments are untested and may not work without modification. Some parts are left to the reader to complete.

On Monday, July 30, 2007 2:16 PM, Michael Bowden wrote: @m856 = sort {$a cmp $b} @m856;

@m856 has MARC::Field objects. Comparing them as such is unlikely to produce desired results. Better might be @m856 = sort {$a->as_usmarc() cmp $b->as_usmarc()} @m856, but then you lose the field object. Better might be to leave out that step and go on to:

    my %seen = ();
    my @new856 = ();

Instead of going through all fields in the record, you could go through the 856s you have gathered, add them to the %seen hash as usmarc (to facilitate comparisons), and, as subsequent ones are already seen, delete the field. After that, you could sort the fields, delete them, and then add back the sorted fields.

    if (@m856) {
        foreach my $f (@m856) {
            #add this field to seen fields if not seen
            unless ($seen{$f->as_usmarc}) {
                $seen{$f->as_usmarc} = $f;
            } #unless seen this field's exact data
            else {
                #seen it, so delete current
                $record->delete_field($f);
            } #else seen this field
        } #foreach 856
        my @new856 = ();
        #add values of %seen, sorted according to keys of %seen
        ###sort remaining/deduplicated 856 fields, delete existing fields, and then add sorted fields back,
        ###where @new856 contains the values of %seen, sorted according to the keys of %seen
        $record->insert_fields_ordered( @new856 );
    }

I hope this helps, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
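The seen-hash idea above can be demonstrated in core Perl alone, using plain strings as stand-ins for the as_usmarc() serializations (a sketch only; the real code would carry MARC::Field objects as the hash values):

```perl
use strict;
use warnings;

# Keep the first occurrence of each value, drop later duplicates,
# then sort the survivors: the same dedup-then-sort approach.
my @m856 = ('http://example.com/b', 'http://example.com/a',
            'http://example.com/b', 'http://example.com/a');
my %seen = ();
my @deduped = grep { !$seen{$_}++ } @m856;
my @sorted = sort { $a cmp $b } @deduped;
print join(', ', @sorted), "\n";  # http://example.com/a, http://example.com/b
```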
RE: Using MARC::Record to delete fields
On Monday, July 16, 2007 8:41 AM, Michael Bowden wrote: Our MARC records have several 035 fields. I want to delete all of the 035s except for the 1st one.

I've modified your code below, removing the foreach field loop. The modified code remains unfinished, as I'll leave it to you to determine the best way to remove $first035 from @m035.

    while (my $record = $batch->next()) {
        #get first 035 to retain
        my $first035 = $record->field('035');
        #get all 035s
        my @m035 = $record->field('035');
        ###
        ###remove 1st 035 from @m035 array using array manipulation techniques
        ###
        #remove remaining 035s
        $record->delete_field(@m035);
        print $record->as_formatted(), "\n\n";
    }

I hope this helps, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
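One simple way to fill in the array-manipulation gap left above, sketched in core Perl with strings standing in for the MARC::Field objects: shift removes and returns the first element, leaving only the fields to delete.

```perl
use strict;
use warnings;

# Stand-ins for the fields returned by $record->field('035')
my @m035 = ('035 keep me', '035 delete me', '035 delete me too');
my $first035 = shift @m035;   # take the first 035 out of the deletion list
print "$first035\n";          # 035 keep me
print scalar(@m035), "\n";    # 2
```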
RE: conditional statement: indicator not blank
On Monday, April 30, 2007 2:18 PM, Corey Harper wrote: I tried the following variations to no avail:
* != '' (no space)
* != ' ' (with space)
* != undef
* != null
* != '#'
* != '_'
I ended up having to use the following, which achieved the desired effect with any of the above in the first slot: if ($field_7xx->indicator(2) != '' || $field_7xx->indicator(2) == 0) {

Would it not be better to use the string comparison operators, ne and eq, since indicators may not necessarily be numeric? Please correct me if I am wrong. Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
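A short core-Perl illustration of why the numeric operators misbehave here: under ==, a blank indicator is coerced to the number 0, so a blank and '0' compare equal, while eq/ne keep them distinct (Perl also warns about the non-numeric coercion unless told otherwise).

```perl
use strict;
use warnings;
no warnings 'numeric';  # silence "isn't numeric" warnings for the demo

my $indicator = ' ';                              # a blank indicator value
print "numeric: equal\n" if $indicator == 0;      # fires: ' ' coerces to 0
print "string: distinct\n" if $indicator ne '0';  # fires: ' ' is not '0'
```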
RE: trailing $ on Cat source 040 field
On Friday, March 30, 2007 12:29 PM, Jackie Shieh wrote: I was working with a group of records where the cataloging source for some of them ends with a dollar sign (see attached records): 040 $aEL$$bspa$cEL$$dDLC 040 $QD$$cQD$ Sometimes, it helps to convert a MARC file to MARCMaker format to review. When I do, curiously, MARC::Record is not able to read it back in as_usmarc, as the trailing dollar immediately followed by another subfield has caused MARC::Record to treat it as an empty subfield.

When I open the .mrc file you attached, I see that the 040 reads EL\x1F$bspa\x1FcEL$. In other words, you have a subfield $ instead of subfield b. The .mkr file you attached has EL$$bspa$cEL$. When I convert the .mrc file into a .mrk using MarcEdit, I get: EL${dollar}bspa$cEL{dollar}, which is technically how the field appears. If I convert the file with MARC::File::MARCMaker [1], I get: EL$$bspa$cEL{dollar}. This points to a possible bug in MARCMaker, in that it makes it impossible to reverse the process and produce an identical .mrc file. I believe the reason it didn't change the dollar sign to {dollar} in the 1st instance is because it only converts subfield data, not subfield codes (assuming that the codes will be characters not needing to be escaped; this may be a flawed assumption). Editing the 040 to have the dollar sign in subfield a followed by the delimiter character produces EL{dollar}$bspa$cEL{dollar} when using both MarcEdit and the Perl module.

[1] Latest version at http://marcpm.cvs.sourceforge.net/marcpm/marc-marcmaker/, CPAN version at http://search.cpan.org/~eijabb/MARC-File-MARCMaker-0.05/. The SourceForge version has recently updated mrc2mkr and mkr2mrc programs in bin/.

I hope this helps, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
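The data-only escaping behavior described above can be sketched in a few lines of core Perl (escape_dollar is a hypothetical helper for illustration, not MARCMaker's actual routine): only dollar signs in subfield data are rewritten, which is why a dollar sign sitting in the subfield-code position slips through unescaped.

```perl
use strict;
use warnings;

# Hypothetical sketch of MARCMaker-style escaping: a literal '$' in
# subfield *data* becomes '{dollar}' so it cannot be mistaken for the
# subfield delimiter in the .mrk display. Codes are never run through it.
sub escape_dollar {
    my ($data) = @_;
    $data =~ s/\$/{dollar}/g;
    return $data;
}

print escape_dollar('EL$'), "\n";  # EL{dollar}
print escape_dollar('QD'), "\n";   # QD (unchanged)
```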
RE: MARC::Batch
On Monday, February 26, 2007 11:38 AM, John D Thiesen wrote: This is presumably an obvious question to most of you, but where do I get MARC::Batch? MARC::Batch is included as part of the MARC::Record distribution. Version 2.0.0 was recently (Jan. 25, 2007) released on CPAN: http://search.cpan.org/~mikery/MARC-Record-2.0.0/ The development/most recent version is available in CVS on SourceForge: http://marcpm.cvs.sourceforge.net/marcpm/marc-record/ (http://sourceforge.net/cvs/?group_id=1254) I hope this helps, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
RE: Module update for MARC::Record
I think, because of the number and size of the changes involved, it would be good to stamp the next version of MARC::Record as 2.0.0.

I very much support its release as v. 2.0.0 (or anything starting with 2). This distinguishes the new versions requiring modern Perl (post-5.8.0) from the earlier versions. I haven't used v. 2.x much, but it doesn't seem to be causing problems in the limited uses I've had for it (but then I'm lucky enough not to need Unicode at the moment). The main problems I've experienced have had to do with the initial installation and updating, due to CPAN/PPM vs. SourceForge versions. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
Re: MARC21 record to CIP block
At 4:22 PM -0700 7/13/06, [EMAIL PROTECTED] wrote: Hi there, However, can anyone here tell me of any tool or tutorial of how to create a CIP block out of a MARC 21 record?

I have posted a module, MARC::PCIPmaker.pm, to: http://home.inwave.com/eija/inprocess/. It should work to produce an ISBD/PCIP-block display from a raw MARC record (even one coded as full-level, though it is designed for CIP-level records). The code is based on a straight translation of code, originally in Visual Basic (with which I am rather unfamiliar, so the translation may be less than perfect), from the Library of Congress' CIP program. The module is available in tar.gz or directly (with a txt extension added for download/viewing on the Web), but I haven't had a chance to create the necessary installation files, so it will need to be manually added to the same folder/directory as MARC::Record to be used.

Synopsis:

    #open MARC file, get MARC::Record/MARC::Batch object
    #$PCIPrecord = $batch->next in while loop, as generally constructed for most other MARC reading Perl programs
    #convert MARC::Record into raw MARC (ISO 2709) format string
    my $record_as_marc = $PCIPrecord->as_usmarc();
    my ($PCIPblock, @errorsinPCIP) = MARC::PCIPmaker::makecard($record_as_marc);
    if (@errorsinPCIP) {
        print "The following errors were found in generating the PCIP block\n", join("\n", @errorsinPCIP), "\n";
    } #if errors
    else {
        print OUT $PCIPblock;
    } #else no errors

I hope this helps. Please let me know of any problems or suggestions. Bryan Baldus [EMAIL PROTECTED] http://home.inwave.com/eija
RE: MARC21 record to CIP block
I've been working on code to produce a CIP (P-CIP in my case) block from a MARC record, using a very literal translation of Visual Basic code into Perl. Currently, it should be able to produce the data block, but does not yet insert line breaks/formatting. The module is currently tailored for QBI's PCIP, but if I have a chance, I may post some of the code to my site this weekend.

Example of input and output. MARC (MARCMaker format provided for readability):

=LDR 00880nam 22002898a 4500
=001 qbi02200951\
=002 006bb
=003 IOrQBI\\
=005 20030103071854.0
=008 021205s2003iluabf\\\b001\0deng\d
=010 \\$a 200199
=020 \\$a199649
=037 \\$a$bQBI
=040 \\$aIOrQBI$cIOrQBI
=999 \\$aPCIP for QBI Web pages
=050 \4$aBF575.H27$bS65 2002
=082 04$a158.1$221
=100 1\$aSmith, Rob$q(Robert Bobbie Bob),$d1966-
=245 14$aThe library, the phonebook, and the philosophical origins of happiness /$cby Rob Smith and Bob Jones.
=250 \\$a1st ed.
=263 \\$a03--
=300 \\$ap. cm.
=504 \\$aIncludes bibliographical references and index.
=650 \0$aHappiness.
=650 \0$aLibraries$xPsychological aspects.
=650 \0$aTelephone$vDirectories$xPsychological aspects.
=700 1\$aJones, Bob$q(Bob Robert Rob),$d1981-

becomes:

Smith, Rob (Robert Bobbie Bob), 1966-
The library, the phonebook, and the philosophical origins of happiness / by Rob Smith and Bob Jones. -- 1st ed.
p. cm.
Includes bibliographical references and index.
LCCN 200199
ISBN 1-996-4-9
1. Happiness. 2. Libraries--Psychological aspects. 3. Telephone--Directories--Psychological aspects. I. Jones, Bob (Bob Robert Rob), 1981- II. Title.
BF575.H27S65 2002
158.1--dc21
qbi02200951

--
Bryan Baldus [EMAIL PROTECTED] http://home.inwave.com/eija
Re: MARC::Lint bug?
At 6:04 PM -0400 6/16/06, Edward Summers wrote: On Jun 16, 2006, at 1:27 PM, Bryan Baldus wrote: MARC::Lint has been revised in SourceForge CVS so that $rules->{$repeatable} is now $rules->{'repeatable'} for field repeatability. Are you able to push this out to CPAN? //Ed

I'll try to produce a CPAN upload this week. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
RE: MARC::Lint bug?
On Thursday, June 15, 2006 9:19 AM, I wrote: I think I may have discovered a bug in the way MARC::Lint parses tag data. [or sets rules for repeatability of fields vs. allowed subfields.]

MARC::Lint has been revised in SourceForge CVS so that $rules->{$repeatable} is now $rules->{'repeatable'} for field repeatability. Please let me know of any problems. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
MARC::Lint bug?
I think I may have discovered a bug in the way MARC::Lint parses tag data. In _parse_tag_rules:

    my $rules = ($self->{_rules}->{$tagno} ||= {});
    $rules->{$repeatable} = $repeatable;

then:

    for my $line ( @lines ) {
        my @keyvals = split( /\s+/, $line, 3 );
        my $key = shift @keyvals;
        my $val = shift @keyvals;
        # Do magic for indicators
        if ( $key =~ /^ind/ ) {
            $rules->{$key} = $val;
        #}

I think having $rules->{$repeatable} and $rules->{$key} (where $key is the subfield code and $repeatable is passed in from the tagno_repeatability_description line of the tag data) is causing $repeatable to be added as an allowable subfield code. I discovered this when wondering why an 082 $a[B]$ROU$214 did not report an error. I plan on looking at this tonight or this weekend. What would you suggest as the best way to resolve this problem? My current line of thinking would have me revising $rules->{$repeatable} to $rules->{'repeatable'}, and leaving the subfields as $rules->{$key}. Does this sound reasonable? Thank you for your assistance, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
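A tiny core-Perl demonstration of the collision (with made-up values; the real rules structure is more involved): storing repeatability under the value of $repeatable creates a stray hash key that looks just like a subfield-code entry, while a fixed literal key avoids it.

```perl
use strict;
use warnings;

my $repeatable = 'R';   # made-up repeatability value from the tag data

# Buggy pattern: the value itself becomes a hash key...
my $rules = {};
$rules->{$repeatable} = $repeatable;   # creates $rules->{'R'}
$rules->{'a'} = 'A';                   # a legitimate subfield rule
# ...so a subfield-code lookup wrongly finds 'R':
print exists $rules->{'R'} ? "stray 'R' subfield key\n" : "clean\n";

# Proposed fix: repeatability lives under a fixed, literal key.
my $fixed = {};
$fixed->{'repeatable'} = $repeatable;
$fixed->{'a'} = 'A';
print exists $fixed->{'R'} ? "still broken\n" : "no stray subfield key\n";
```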
Module updates and HTML parsing question
I have recently posted updates to several modules, as described below (and at http://home.inwave.com/eija/). As mentioned below, I'm trying to write something to parse the lists of updated name authority records with closed dates (posted regularly on OCLC's website, http://www.oclc.org/rss/feeds/authorityrecords/default.htm). I don't have much experience working with HTML/XML, so I welcome any suggestions you may have on the best way to parse these files into a plain-text, non-Unicode, tab-separated file of old_heading \t new_heading pairs. I am not able to install any modules that require compiling, and would like the solution to work on Mac (Classic) and Windows platforms without having to be concerned much about character encodings. My plan is to bring each .htm file up in my Web browser (IE), and then save as a Web page, HTML only, with the default Unicode (UTF-8) encoding. After saving the files into a directory, the parsing program will look at each .htm file, pull out the changed names, and put them into the single plain-text file described above.

Thank you for any assistance you may be able to provide,

Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija

Module updates:

MARC::Lint.pm (in CVS on SourceForge; not yet updated on CPAN):
-DATA section updated recently to MARC Update no. 6 (Oct. 2005).

MARC::Errorchecks.pm (posted to CPAN and on my site):
Version 1.11: Updated June 5, 2006. Released June 6, 2006.
-Implemented check_006($record) to validate 006 (currently only does a length check).
-Revised validate008($field008, $mattype, $biblvl) to use an internal sub for material-specific bytes (18-34).
-Revised the validate008($field008, $mattype, $biblvl) language code check (008/35-37) to report the availability of the new 'zxx' code when '   ' (3 blanks) is the code in the record.
-Added 'mgmt.' to %abbexceptions for check_nonpunctendingfields($record).

MARC::Lint::CodeData.pm (most current version available through CVS on SourceForge with MARC::Lint; also included in MARC::Errorchecks):
-Versions 1.05-1.08 were updated with additions of codes from technical notices.

Lintadditions.pm (available on my site only, since I'm still trying to merge most of these checks into MARC::Lint, once I find time to write tests for each):
Version 1.10: Updated Oct. 17, 2005-May 18, 2006. Released June 6, 2006.
-Added check_024() for UPC and EAN validation. Uses Business::Barcode::EAN13 and Business::UPC for these checks.
-check_042() updated with valid source codes from the MARC list for sources.
-check_050() updated to report cutters not preceded by a period.
-Misc. bug fixes, including turning off uninitialized warnings for short 007 bytes.

MARC::Global_Replace.pm (available only on my site; still in pre-alpha stage, so in /inprocess/ rather than /bryanmodules/):
Version 0.05: Updated May 1, 2006. Released June 6, 2006.
-Revised identify_changed_hdgs($field, \%heading_data, \%changed_hdgs_sub_a), attempting to resolve the problem of closed vs. open dates.
Version 0.04: Updated Feb. 13, 2006. Unreleased.
-Modified identify_changed_hdgs($field, \%heading_data, \%changed_hdgs_sub_a) to not report headings where new and old are identical.
-Need to strip ending periods for the match to work!!
-Testing needed for Sears heading changes--currently appears to fail to match.

Script updates (available only on my site; still in pre-alpha stage, so in /inprocess/):

LCSHchangesparserpl107.txt:
Version 1.07: Updated May 8, 2006.
-Revised changed heading regex to include \ (e.g. ATT).
Version 1.06: Updated Oct. 5, 2005.
-Added 682 parsing.
-New_tag is set to 682 when headings are extracted from that field.
-Global_Replace will need to take these into account during parsing and comparison, since there is a chance that the parsing done by this script will produce unexpected/unreliable results.
-682 parsing is incomplete and will likely fail on headings with qualifiers.
Version 1.05: Updated Aug. 25, 2005.
-Revised parsing to account for some lines previously counted as bad.

parsedeathdateslists.pl.txt (available only on my site, in pre-pre-alpha stage, so in /inprocess/):
No version. Very preliminary test code.
-Help needed in stripping entities other than the subfield delimiter.
-Help needed in selecting the best HTML/XML parser for OCLC's closed-dates lists.
-Requires a pure Perl solution (I have no ability to use a compiler or to install extra, non-Perl programs, so only modules that came with Perl 5.6 or 5.8.0, or that are simply .pm files for the site/lib directory).
-Must be cross-platform capable and non-Unicode/capable of stripping non-ASCII characters without worrying about Mac (Classic) vs. Windows character sets.
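For what it's worth, given the pure-Perl constraint above (no compiled modules), a regex-based pass over each saved .htm file may be enough if the lists have a predictable table structure. The sketch below is only a hedged starting point: the sample HTML rows, the one-pair-per-line layout, and the handful of entities handled are all assumptions that would need adjusting against the real OCLC files.

```perl
#!perl
use strict;
use warnings;

# Minimal pure-Perl sketch: pull old/new heading pairs out of HTML table rows
# and emit old_heading \t new_heading lines. The input structure here is
# hypothetical; adjust the patterns to the actual saved pages.
my $html = <<'HTML';
<tr><td>Smith, John, 1920-</td><td>Smith, John, 1920-2004</td></tr>
<tr><td>Doe, Jane, 1935-</td><td>Doe, Jane, 1935-2005</td></tr>
HTML

my @pairs;
for my $line ( split /\n/, $html ) {
    next unless $line =~ m{<td>(.*?)</td>\s*<td>(.*?)</td>}i;
    my ( $old, $new ) = ( $1, $2 );
    for ( $old, $new ) {
        s/<[^>]*>//g;          # strip any remaining tags
        s/&amp;/&/g;           # decode a few common entities
        s/&nbsp;/ /g;
        s/[^\x20-\x7E]//g;     # drop non-ASCII bytes (crude, but encoding-neutral)
        s/^\s+//; s/\s+$//;    # trim whitespace
    }
    push @pairs, "$old\t$new";
}
print "$_\n" for @pairs;
```

Since this avoids any HTML parser module entirely, it should run anywhere Perl 5.6/5.8 does, at the cost of being brittle if the page layout changes.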
RE: Question about MARC::RECORD usage
On Wednesday, May 03, 2006 9:28 AM, Ed @ Go Britain wrote:

In the 245 record it is possible to have numerous $n and $p fields which need to be output with formatting between the fields. My knowledge of Perl isn't too good and I'm struggling to know how to extract these repeated subfields and place formatting between the subfields in the prescribed order $a, $b, $n, $p, $c. Both n and p could be repeated several times.

There are times when the proper order would be $a, $n, $p, $b, $c, as well, aren't there?

At the moment I take each field into a variable, e.g. $Field245c = $record->subfield('245','c'); and then output these as follows:

if ($Field245c) { $EntryBody = $EntryBody . ' -- ' . $Field245c; }

However, this approach assigns the first occurrence of a subfield and I haven't yet discovered a technique for accessing further subfields.

According to the POD in MARC::Field: if you think there might be more than one, you can get all of them by calling in a list context:

my @subfields = $field->subfield( 'a' );

Alternatively, get all subfields in the field and parse as needed:

my $field245 = $record->field('245');
my @subfields = $field245->subfields();
while (my $subfield = pop(@subfields)) {
    my ($code, $data) = @$subfield;
    #do something with the data
    #or add code and data to an array
    unshift (@newsubfields, $code, $data);
} # while

I hope this helps,

Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
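Walking the subfields in the order they appear in the field (which is the order the record itself prescribes) sidesteps the whole problem of repeated $n/$p. A rough sketch: a plain list of [code, data] pairs stands in here for what $field->subfields() returns, and the separator table is a made-up example, not AACR2-complete punctuation.

```perl
use strict;
use warnings;

# Stand-in for MARC::Field->subfields(): ordered [code, data] pairs.
my @subfields = (
    [ a => 'Works.' ],
    [ n => 'Part 1,' ],
    [ p => 'Early years ;' ],
    [ n => 'Part 2.' ],
    [ c => 'by A. Author.' ],
);

# Hypothetical separators, chosen per subfield code.
my %sep = ( a => '', b => ' : ', n => ' -- ', p => ' -- ', c => ' / ' );

my $display = '';
for my $sf (@subfields) {
    my ( $code, $data ) = @$sf;
    my $separator = defined $sep{$code} ? $sep{$code} : ' ';
    $display .= ( $display ne '' ? $separator : '' ) . $data;
}
print "$display\n";
```

Because the loop follows field order rather than fetching codes one at a time, repeated $n and $p land exactly where they appear in the record, with formatting inserted between each.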
RE: Deleting a subfield using MARC::Record
OK -- here's the call for a vote. All interested perl4lib members are encouraged to participate by emailing the list. +1 Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
Re: Slowdown when fetching multiple records
At 4:14 PM + 2/19/06, Tony Bowden wrote: So, it presumably is an issue with the Library of Congress server. Is there some sort of automatic throttling there? Or is there likely to be some sort of option that I should be setting, but not? I've not used Perl-based Z39.50 searching, but I think LC requires a 6-second pause between searches, to reduce burden on their servers. While searching using MarcEdit 4.6 (or pre-5), we were locked out for the day for not following this new (Oct. 2005?) rule. MarcEdit 5 beta allows a 6-second pause for servers that might require it, like LC's. I don't know how the Perl modules handle this pause for batch searching LC, but it might be why you are experiencing a delay. I hope this helps, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
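If the Perl Z39.50 client being used has no built-in delay option, a thin wrapper that sleeps between queries should keep a batch job under such a rate limit. A sketch, where the search callback and the 6-second figure are assumptions to adapt to the actual client and server policy:

```perl
use strict;
use warnings;

# Run a search callback over a list of queries, pausing between each query
# to avoid hammering a server that enforces a delay (e.g. 6 seconds for LC).
sub throttled_search {
    my ( $queries, $delay, $do_search ) = @_;
    my @results;
    for my $i ( 0 .. $#{$queries} ) {
        push @results, $do_search->( $queries->[$i] );
        sleep $delay if $i < $#{$queries};    # no pause needed after the last query
    }
    return @results;
}

# Demonstration with a dummy search sub and no delay; a real run would pass 6
# and a callback that performs the actual Z39.50 query.
my @found = throttled_search(
    [ 'isbn=0340596910', 'isbn=0123456789' ],
    0,
    sub { return "results for $_[0]" },
);
print "$_\n" for @found;
```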
RE: MARC.pm unblessed reference
-----Original Message-----
From: Aaron Huber [mailto:[EMAIL PROTECTED]
Sent: Monday, November 21, 2005 10:33 PM
To: perl4lib@perl.org
Subject: MARC.pm unblessed reference

Hi All, I am a complete newbie to this and have been testing out MARC.pm. I'm trying to return just the ISBN values from a group of MARC records. It works fine when I specify the record number, but when I put it through the loop it returns the above error.

My recommendation would be that you try to switch from MARC.pm to the MARC::Record distribution (see a previous posting to this list: http://www.nntp.perl.org/group/perl.perl4lib/2166). Then the code below should accomplish what you want (though I have not tested it as it appears in this message--see http://home.inwave.com/eija/fullrecscripts/Extraction/extractisbn.txt for the original version, for which you would need to also install my MARC::BBMARC (from my website, or MARC::Errorchecks from CPAN, which includes that module)). In my modified version below, I extract only subfield 'a' of the 020, and I clean the field to (hopefully) leave only the ISBN portion of the field--removing any qualifiers. It should be easy enough to modify the code to extract the entire 020 field with as_string() and report it.

I hope this helps,

Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija

#!perl

=head2

Extracts ISBN from a file of MARC records.

=cut

###########################
### Initialize includes ###
### and basic needs     ###
###########################
use strict;
#MARC::Batch is installed as part of MARC::Record
use MARC::Batch;

##########################
## Start main program   ##
##########################

my $inputfile = 'marc.dat';
#my $exportfile = 'out.dat';
#open(OUT, ">$exportfile") or die "Problem opening $exportfile, $!";

#initialize $batch as new MARC::Batch object
my $batch = MARC::Batch->new('USMARC', $inputfile);

## Start extraction ##

my $runningrecordcount = 0;

#start while loop through records in file
while ( my $record = $batch->next() ) {
    $runningrecordcount++;
    #get control number for reporting
    #my $controlno = $record->field('001')->as_string() if ($record->field('001'));
    ### loop through each 020 field ###
    for my $field020 ( $record->field('020') ) {
        my $isbn = $field020->subfield('a');
        if (defined ($isbn)) {
            #remove any hyphens
            $isbn =~ s/\-//g;
            #remove nondigits
            $isbn =~ s/^\D*(\d{9,12}[X\d])\b.*$/$1/;
            # Now report it
            print $runningrecordcount, ": ", $isbn, "\n";
        } # if isbn defined
    } # for
} # while

##########################
## Main program done.   ##
##########################

######################
### END OF PROGRAM ###
######################
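Once the 020$a has been cleaned down to bare digits like this, it may also be worth verifying the ISBN-10 check digit before reporting, so that miskeyed ISBNs in vendor records get flagged. A hedged sketch in core Perl only (Business::ISBN from CPAN would be more thorough where it can be installed):

```perl
use strict;
use warnings;

# Validate an ISBN-10 check digit: the weighted sum of digit * (10 - position)
# must be divisible by 11. A final 'X' stands for a check value of 10.
sub isbn10_ok {
    my $isbn = shift;
    return 0 unless $isbn =~ /^\d{9}[\dX]$/;
    my $sum   = 0;
    my @chars = split //, $isbn;
    for my $i ( 0 .. 9 ) {
        my $value = $chars[$i] eq 'X' ? 10 : $chars[$i];
        $sum += $value * ( 10 - $i );
    }
    return ( $sum % 11 ) == 0;
}

print isbn10_ok('0340596910') ? "valid\n" : "invalid\n";
```

In the extraction loop above, a record whose cleaned $isbn fails this test could be reported with a warning instead of (or alongside) the bare number.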
MARC-File-MARCMaker to CPAN
Version 0.05 of MARC::File::MARCMaker has been released to CPAN (http://search.cpan.org/~eijabb/MARC-File-MARCMaker-0.05/). It has no internal changes from version 0.04, previously mentioned as being uploaded to SourceForge, but is simply a version update for initial CPAN release. Also, I've updated MARC::Doc::Tutorial.pod in CVS on SourceForge (http://cvs.sourceforge.net/viewcvs.py/marcpm/marc-record/lib/MARC/Doc/Tutorial.pod?rev=1.30&view=log) with a section on MARCMaker. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
RE: MARC-File-MARCMaker in CVS
On Monday, October 24, 2005 9:53 PM, Edward Summers wrote: Are you planning to release MARC::File::MARCMaker to CPAN? I'll plan on finding time to do so this weekend, unless there are objections/reasons for not uploading. Also, it might be worthwhile adding a section to MARC::Doc::Tutorial if you have the energy. I'll look into doing this. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija p.s. My announcement message should have read version 0.04 rather than version 0.4.
marclint update
I have updated the marclint program in CVS on SourceForge to report errors encountered during the decoding process from raw MARC to MARC::Record objects. I also changed tabs to 4 spaces. Please let me know if this causes problems with anything. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
RE: MARC::Record elementary question?
When I try to run a program just to read through the file and display the 082 field of each successive record, I get this message: Bareword "qr" not allowed while "strict subs" in use at C:\progra~1\perl\lib\MARC/Record.pm line 209.

I think this is a bug in v. 1.38 of the MARC::Record module. It appears to be fixed in version 1.39_02 (which, though a developer release, seems stable for how I've been using it--non-Unicode records). Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
marclint and decoding errors
Recently, I was presented with a file of MARC 21 (ISO 2709) records containing indicators of hex 00. When these are decoded by MARC::Record, they generate error messages from MARC::Field->new(): "Invalid indicator \x00 forced to blank" (repeated once per bad indicator). If this message had come from MARC::File::USMARC, it would have read "Invalid indicators \"$indicators\" forced to blanks $location for tag $tagno\n". Can the warning message from MARC::Field be updated with the tagno, to help with identifying which field has the problem?

Also, I'm considering revising the marclint program included in bin/ of the MARC::Lint distribution to include any decoding errors encountered, as seen below. Is there anything I should consider before doing this, or would this cause problems for anyone using marclint? If this is not a problem, I'll also change the tabs to 4 spaces for indentation. (Changes are indicated with a comment at the end of the line.)

while ( my $marc = $file->next() ) {
    if ( not $marc ) {
        warn $MARC::Record::ERROR;
        ++$errors{$filename};
    } else {
        ++$counts{$filename};
    }
    #store warnings in @warningstoreturn #+
    my @warningstoreturn = (); #+
    #retrieve any decoding errors #+
    #get any warnings from decoding the raw MARC #+
    push @warningstoreturn, $marc->warnings(); #+
    $linter->check_record( $marc );
    #add any warnings from MARC::Lint #+
    push @warningstoreturn, $linter->warnings; #+
    if ( @warningstoreturn ) { #revised
        print join( "\n", $marc->title, @warningstoreturn, "", "" ); #revised
        ++$errors{$filename};
    }
} # while

Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
MARCMaker etc. updates
I have updated my site with some in-process modules and a new version of the LCSH changes parser script (in http://home.inwave.com/eija/inprocess/).

MARC::File::MARCMaker.pm is an early version of a module to convert files to and from MARCMaker format (as used by MarcEdit and the LC tools, http://www.loc.gov/marc/makrbrkr.html). Much of the code was originally part of the MARC.pm module (character conversion is used essentially unmodified from that module). The current version appears to successfully convert files both ways, but has not been fully tested. A future version of the distribution should include a program similar to marcdump (from MARC::Record)--1 or 2 programs to convert records both to and from the format.

MARC::Global_Replace.pm is a very early-stage version of a module to facilitate global subject heading changes. At present, it appears to successfully identify changed headings in MARC records (using the included global_replace_ident.pl script, http://home.inwave.com/eija/inprocess/MARC-Global_Replace0.03/bin/global_replace_ident.txt (.txt for download)), but has not really been tested in any serious way.

The LCSH changes parser script (http://home.inwave.com/eija/inprocess/LCSHchangesparserpl104.txt) creates a file (or set of files), allhash.txt, which is used by MARC::Global_Replace. It takes a folder of LCSH weekly lists (saved as text from the LC site, http://www.loc.gov/catdir/cpso/) and produces files of the changed headings (along with a bad.txt file containing headings not yet accounted for by the script).

I have also posted a new version of MARC::Errorchecks (1.09) to CPAN. Changes are listed there and on my site. I welcome any comments and suggestions (to [EMAIL PROTECTED]).

Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
Errorchecks etc. updates
(Resending--with apologies for duplication or bouncing)

I have posted updates to several of my MARC-related modules and scripts. Changes to each of the main files are listed below. The files have been posted to my home page (http://home.inwave.com/eija). Errorchecks (MARC::Errorchecks) is also available on CPAN. CodeData (MARC::Lint::CodeData) is included as part of Errorchecks, as well as part of MARC::Lint (in SourceForge CVS). MARC::File::MARCMaker is a very preliminary version; I welcome any assistance in improving it. I haven't had time yet to develop working tests or test files, though I do have those used for MARC.pm. The LCSH changes parser script will eventually be used with a new module I'm working on, tentatively named MARC::Global_Replace. The module is intended to automatically update headings from their old form to the newest form. Currently it is almost capable of identifying subfield 'a' 6xx headings that have an old heading in the LCSH weekly lists.

Module updates:

Lintadditions.pm:
Version 1.09: Updated Mar. 31-Apr., 2005. Released July 16, 2005.
-check_260() updated to report an error if subfields 'a' and 'b' are not present.
-More '==' etc. changed to 'eq' etc. for indicators.
-check_082() updated to set $dewey to the empty string if no 082$a is present before checking for 3 digits.

Errorchecks.pm:
Version 1.08: Updated Feb. 15-July 11, 2005. Released July 16, 2005.
-Added 008errorchecks.t (and 008errorchecks.t.txt) tests for 008 validation.
-Added check of current year, month, day vs. 008 creation date, reporting an error if the creation date appears to be later than local time. Assumes 008 dates of 00mmdd to 70mmdd represent post-2000 dates.
--This is a change from the previous range, which treated dates of 00-06 as 200x, 80-99 as 19xx, and 07-79 as invalid.
-Added _get_current_date() internal sub to assist with the check of creation date vs. current date.
-findemptysubfields($record) also reports an error if period(s) and/or space(s) are the only data in a subfield.
-Revised wording of error messages for validate008($field008, $mattype, $biblvl).
-Revised parse008date($field008string) error message wording; bug fix.
-Bug fix in video007vs300vs538($record) for gathering multiple 538 fields.
-Added check in check_5xxendingpunctuation($record) for space-semicolon-space-period at the end of 5xx fields.
-Added field count check for more than 50 fields to check_fieldlength($record).
-Added 'webliography' as an acceptable 'bibliographical references' term in check_bk008_vs_bibrefandindex($record), even though it is discouraged. Consider adding an error message indicating that the term should be 'bibliographical references'?
-Code indenting changed from tabs to 4 spaces per tab.
-Misc. bug fixes, including changing '==' to 'eq' for tag numbers, bytes in 008, and indicators.

MARC::Lint::CodeData:
Version 1.02: Updated June 21-July 12, 2005. Released (to CPAN) with new version of MARC::Errorchecks.
-Added GAC and Country code changes for Australia (July 12, 2005 update).
-Added 6xx subfield 2 source code data for the June 17, 2005 update.
-Updated valid language codes to the June 2, 2005 changes.

Module in process:

MARC::File::MARCMaker.pm (zipped and uncompressed as /marc-marcmaker/):
Version 0.02: Updated July 12-13, 2005. Released July 16, 2005.
-Preliminary version of encode() for fields and records.
-Appears to work when no special characters are present (including dollar signs).
-See TODO.txt and readme0.02.txt for the list of items still needing to be done and other notes.
-Note: This is a pre-alpha release, and little testing has been done on the results of decode() or encode().

Added and changed scripts:

Updated LCSH Changes Parser script, LCSHchangesparserpl103.txt:
-Now creates files with tab-separated lines: old_tag \t old_hdg \t new_tag \t new_hdg.
-Better parsing of weekly files.

I welcome any comments and suggestions. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
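The two-digit year windowing described in the Errorchecks changes (008 creation years 00-70 treated as post-2000) can be sketched as below. The sub name and the exact cutoff handling are illustrative only, drawn from the changelog wording rather than from the real validate008() code:

```perl
use strict;
use warnings;

# Expand the two-digit year from 008/00-01 into a four-digit year,
# assuming 00-70 are 2000-2070 and 71-99 are 1971-1999 (per the changelog).
sub expand_008_year {
    my $yy = shift;    # two-digit string from 008/00-01
    return undef unless defined $yy and $yy =~ /^\d\d$/;
    return $yy <= 70 ? 2000 + $yy : 1900 + $yy;
}

print expand_008_year('04'), "\n";    # a post-2000 creation date
print expand_008_year('98'), "\n";    # a pre-2000 creation date
```

With the expanded year in hand, comparing against the current local date (as _get_current_date() is described as supporting) becomes a plain numeric comparison.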
RE: LC call number sorting utilities
On Sunday, March 27, 2005 7:09 PM, Michael Doran wrote:

I recently converted a Library of Congress (LC) call number normalization routine (that I had written for a shelf list application) into a couple of Perl LC call number sorting utilities.

Thank you for this. It seems to work well (45,000+ numbers sorted; a quick scroll-through seems to show everything sorted correctly). However, as written, it seems to bog down on my machine after a few thousand numbers. Instead of:

@input_list = (@input_list, $call_no);
and
@sorted_list = (@sorted_list, $call_no_array{$key});

perhaps:

push @input_list, $call_no;
and
push @sorted_list, $call_no_array{$key};

might help to speed things up (it did in my case).

I hope this helps. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
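The reason the rewrite helps: @list = (@list, $item) copies the entire array on every append, which is quadratic over thousands of call numbers, while push appends in place. A small sketch showing the two forms produce identical results (actual timing comparison is left to the core Benchmark module):

```perl
use strict;
use warnings;

# Hypothetical call numbers, just for demonstration.
my @items = map { "QA76.$_" } 1 .. 1000;

# Quadratic: each append rebuilds and copies the entire list so far.
my @copied;
@copied = ( @copied, $_ ) for @items;

# Linear: push appends in place, no copying of earlier elements.
my @pushed;
push @pushed, $_ for @items;

print scalar(@copied) == scalar(@pushed) ? "same length\n" : "differ\n";
```

For a few hundred elements the difference is invisible; at tens of thousands (as with the 45,000+ call numbers above), the copying version does on the order of a billion element copies where push does 45,000 appends.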
RE: Sort with MARC::Record
Has anyone sorted a file of hundreds of records by 001? I haven't done sorting yet, but you may want to see if my MARC::BBMARC's [1], MARC::BBMARC::updated_record_hash() sub may be of use. It reads a file of MARC records and stores them in a hash with the 001 as key, the raw MARC as value. It should be fairly simple, then, to use this to output the desired records in the proper order. It should work ok on small files of MARC records, but depending on your system's memory, may die a horrible death on large record sets. My extractbycontrolno script [2] reads a file of control numbers (using BBMARC's updated_record_array() to save memory) and a separate file of MARC records, and outputs the matching records. It doesn't do any sorting, so it depends on the order it finds the records. [1] MARC::BBMARC is available directly from my homepage at: http://home.inwave.com/eija/bryanmodules/, or bundled with MARC::Errorchecks on CPAN http://search.cpan.org/~eijabb/MARC-Errorchecks-1.06/ [2] http://home.inwave.com/eija/fullrecscripts/Extraction/extractbycontrolno.txt Hope this helps, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
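Once updated_record_hash() (or anything similar) has produced a hash keyed on the 001, the sort itself is just iterating sorted keys and writing the raw values back out. A sketch with placeholder values standing in for raw MARC (a real run would build %records from a file via MARC::BBMARC):

```perl
use strict;
use warnings;

# Hash as updated_record_hash() would build it: 001 control number => raw MARC.
# The values here are placeholders, not real ISO 2709 data.
my %records = (
    'ocm00000003' => '(raw record 3)',
    'ocm00000001' => '(raw record 1)',
    'ocm00000002' => '(raw record 2)',
);

# Emit the records in 001 order. This is a string sort; use { $a <=> $b }
# instead if your control numbers are purely numeric.
for my $controlno ( sort keys %records ) {
    # A real script would print $records{$controlno} to the output file.
    print "$controlno\n";
}
```

As noted above, this holds every record in memory at once, so it shares the same scaling limit as updated_record_hash() itself on very large files.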
RE: MARC::Lint update
There were some uninitialized variable warnings, nothing serious. 'make test' will run perl under the warnings pragma, so 'use warnings' in your module will help you catch this sort of thing early.

I generally 'use warnings' or use the -w flag in the modules and scripts I've been writing. I didn't notice it was missing. I need to add strict and warnings to CodeData, as well. In modules/package files, is it the practice to leave out the shebang (#!perl) line, since the file is not generally executed directly? If so, is that the reason for 'use warnings' vs. -w?

I don't know what editor you use, but it's been the norm for marc/perl module folks to not embed tabs in source code for indentation. vim and emacs both support mapping a tab to spaces when you hit the tab key.

I use BBEdit Lite, which has a good global search/replace function. In the future, I'll try to remember to convert the indentation tabs to 4 spaces per tab. Are non-indentation tabs ok? In MARC::Lint::CodeData, I used split on \t to split the codes into a hash. Since some codes have or need spaces, splitting on \s would probably not work as well.

Thank you for your assistance, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
MARC::Lint update
The SourceForge CVS version of MARC::Lint has been updated with new checks (041, 043), revisions to check_245, a new internal _check_article method, the addition of MARC::Lint::CodeData (for 041, 043, etc.), and 2 new tests. Watch for further added check_xxx methods in the near future, as I move them out of MARC::Lintadditions into MARC::Lint. Thank you, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
RE: Ignoring Diacritics accessing Fixed Field Data
On Tuesday, January 11, 2005 2:13 PM, Michael Doran wrote:

Assuming that you are asking how to strip out the MARC-8 combining diacritic characters, try inserting the substitution commands listed (as shown below) just prior to the substr commands:

my $ME = $field->subfield('a');
$ME =~ s/[\xE1-\xFE]//g;
my $four100 = substr( $ME, 0, 4 );

my $TITLE = $field->subfield('a');
$TITLE =~ s/[\xE1-\xFE]//g;
my $four245 = substr( $TITLE, 0, 4 );

You might want to change the procedure for getting the title to skip articles (untested, may need corrections):

#given $record being the MARC::Record object, and exactly one 245 field
#being present, as required by MARC 21 rules
my $titleind2 = $record->field('245')->indicator(2);
my $TITLE = $field->subfield('a');
$TITLE =~ s/[\xE1-\xFE]//g;
my $four245;
$four245 = substr( $TITLE, 0+$titleind2, 4 ) if $titleind2 =~ /^[0-9]$/;
#the if statement should be unnecessary, since the 245 2nd indicator
#should always be some number, but just in case.

Hope this helps, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
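As a self-contained illustration of the two pieces (the sample bytes and the indicator value below are hypothetical MARC-8 data, not taken from a real record):

```perl
use strict;
use warnings;

# MARC-8 combining diacritics occupy the \xE1-\xFE range and *precede* the
# letter they modify, so stripping them leaves the plain letters behind.
my $title = "Th\xE2eatre";    # \xE2 standing in for a combining diacritic byte
( my $clean = $title ) =~ s/[\xE1-\xFE]//g;

# Skip nonfiling characters using the 245 second indicator; here assumed '4',
# as for a title beginning "The ".
my $ind2   = '4';
my $sample = 'The theatre';
my $four245 = '';
$four245 = substr( $sample, 0 + $ind2, 4 ) if $ind2 =~ /^[0-9]$/;

print "$clean\n$four245\n";
```

The ordering matters: strip the diacritic bytes first, then apply the nonfiling offset, since the indicator counts characters in the displayed heading rather than raw bytes including diacritics.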
Re: Module to read Isis data
: '579a7c6901c654bdeac10547a98e5b71' not ok 60 - md5 4 # Failed test (RAM Disk:002_isis.t at line 119) # got: 'e605bf7847b50064459fe1071bb8b4df' # expected: '7d2adf1675c83283aa9b82bf343e3d85' not ok 61 - md5 5 # Failed test (RAM Disk:002_isis.t at line 119) # got: '0e27001d65f9a7d7be485c5f13e17bb8' # expected: 'daf2cf86ca7e188e8360a185f3b43423' ok 62 - The object isa Biblio::Isis ok 63 - count is 5 ok 64 - read_cnt ok 65 - returns 2 elements ok 66 - cnt 1 ORDN same ok 67 - cnt 1 ABNORMAL same ok 68 - cnt 1 N same ok 69 - cnt 1 LIV same ok 70 - cnt 1 K same ok 71 - cnt 1 ORDF same ok 72 - cnt 1 FMAXPOS same ok 73 - cnt 1 NMAXPOS same ok 74 - cnt 1 POSRX same ok 75 - cnt 2 ORDN same ok 76 - cnt 2 ABNORMAL same ok 77 - cnt 2 N same ok 78 - cnt 2 LIV same ok 79 - cnt 2 K same ok 80 - cnt 2 ORDF same ok 81 - cnt 2 FMAXPOS same ok 82 - cnt 2 NMAXPOS same ok 83 - cnt 2 POSRX same ok 84 - fetch 1 ok 85 - MFN 1 702:0 ^aHolder^bElizabeth ok 86 - MFN 1 990:0 2140 ok 87 - MFN 1 990:1 88 ok 88 - MFN 1 990:2 HAY ok 89 - MFN 1 675:0 ^a159.9 ok 90 - MFN 1 210:0 ^aNew York^cNew York University press^dcop. 1988 ok 91 - MFN 1 801:0 ^aFFZG ok 92 - fetch 2 ok 93 - MFN 2 215:0 ^aIX, 275 str.^d23 cm ok 94 - MFN 2 200:0 ^aPsychoanalysis and psychology^eminding the gap^fStephen Frosh ok 95 - MFN 2 990:0 2140 ok 96 - MFN 2 990:1 89 ok 97 - MFN 2 990:2 FRO ok 98 - MFN 2 210:0 ^aNew York^cUniversity press^d1989 ok 99 - MFN 2 700:0 ^aFrosh^bStephen ok 100 - fetch 3 ok 101 - MFN 3 200:0 ^aPsychoanalitic politics^eJacques Lacan and Freud's French Revolution^fSherry Turkle ok 102 - MFN 3 990:0 2140 ok 103 - MFN 3 990:1 92 ok 104 - MFN 3 990:2 LAC ok 105 - MFN 3 210:0 ^aLondon^cFree Associoation Books^d1992 ok 106 - MFN 3 700:0 ^aTurkle^bShirlie ok 107 - MFN 3 686:0 ^a2140 ok 108 - MFN 3 686:1 ^a2140 ok 109 - fetch 4 ok 110 - MFN 4 200:0 ^aKey studies in psychology^fRichard D. 
Gross ok 111 - MFN 4 210:0 ^aLondon^cHodder Stoughton^d1994 ok 112 - MFN 4 10:0 ^a0-340-59691-0 ok 113 - MFN 4 700:0 ^aGross^bRichard ok 114 - fetch 5 not ok 115 - MFN 5 200:0 1\#^aPsychology^fCamille B. Wortman, Elizabeth F. Loftus, Mary E. Marshal # Failed test (RAM Disk:002_isis.t at line 104) # got: 1 # expected: 0 not ok 116 - MFN 5 225:0 1\#^aMcGraw-Hill series in Psychology # Failed test (RAM Disk:002_isis.t at line 104) # got: 1 # expected: 0 not ok 117 - md5 1 # Failed test (RAM Disk:002_isis.t at line 119) # got: 'fbaa4b35c85b289e9fec15ba0f99b14a' # expected: 'f5587d9bcaa54257a98fe27d3c17a0b6' not ok 118 - md5 2 # Failed test (RAM Disk:002_isis.t at line 119) # got: '14f828e2049a5d8523b6301c7009a3fe' # expected: '3be9a049f686f2a36af93a856dcae0f2' not ok 119 - md5 3 # Failed test (RAM Disk:002_isis.t at line 119) # got: '67d92a83434115acd98c4cb28b2784ec' # expected: '3961be5e3ba8fb274c89c08d18df4bcc' not ok 120 - md5 4 # Failed test (RAM Disk:002_isis.t at line 119) # got: 'e605bf7847b50064459fe1071bb8b4df' # expected: '5f73ec00d08af044a2c4105f7d889e24' not ok 121 - md5 5 # Failed test (RAM Disk:002_isis.t at line 119) # got: '0e27001d65f9a7d7be485c5f13e17bb8' # expected: '843b9ebccf16a498fba623c78f21b6c0' ok 122 - deleted found ok 123 - MFN 3 is deleted ok 124 - deleted not found ok 125 - MFN 3 is deleted # Looks like you planned 110 tests but ran 15 extra. ## Hope this helps, Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
RE: MARC::Record tests
There is code in MARC::File::MicroLIF::_get_chunk that handles DOS (\r\n) and Unix (\n) line endings, but not Mac (\r).

This is true, and it seems to work. Unfortunately, it is not reached by the test, since the test calls decode() directly, instead of going through _next() or _get_chunk. Perhaps the:

# for ease, make the newlines match this platform
$lifrec =~ s/[\x0a\x0d]+/\n/g if defined $lifrec;

in _next() should be moved (or added as duplicate code) to decode(), just between the lines:

my $marc = MARC::Record->new();
### $text =~ s/[\x0a\x0d]+/\n/g if defined $text;
my @lines = split( /\n/, $text );

Bryan Baldus [EMAIL PROTECTED] [EMAIL PROTECTED] http://home.inwave.com/eija
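The substitution in question handles all three line-ending conventions in one pass, since it collapses any run of CR/LF bytes into a single \n. A quick self-contained check (the sub wrapper is just for demonstration):

```perl
use strict;
use warnings;

# Normalize DOS (\x0d\x0a), Mac (\x0d), and Unix (\x0a) line endings to \n,
# using the same substitution as MARC::File::MicroLIF::_next().
sub normalize_newlines {
    my $text = shift;
    $text =~ s/[\x0a\x0d]+/\n/g if defined $text;
    return $text;
}

for my $raw ( "a\x0d\x0ab", "a\x0db", "a\x0ab" ) {
    my @lines = split /\n/, normalize_newlines($raw);
    print scalar(@lines), " lines\n";    # each variant splits into the same two lines
}
```

One side effect worth noting: because the pattern matches a *run* of CR/LF bytes, genuinely blank lines (\n\n) collapse as well, which is fine for MicroLIF parsing but would matter in a general-purpose normalizer.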
MARC::Errorchecks and other updates
I have updated my modules again. Changes are listed below. I have also uploaded MARC::Errorchecks to CPAN [1]. I welcome any comments, questions, or suggestions. I have included MARC::BBMARC in the MARC::Errorchecks CPAN distribution (though I have no idea whether the Makefile.PL will install it automatically).

Changes (Oct. 17, 2004):

MARC::Errorchecks:
Version 1.03: Updated Aug. 30-Oct. 16, 2004. Released Oct. 17. First CPAN version.
-Moved subs to MARC::QBIerrorchecks:
--check_003($record)
--check_CIP_for_stockno($record)
--check_082count($record)
-Fixed bug in check_5xxendingpunctuation for first 10 characters.
-Moved validate008() and parse008date() from MARC::BBMARC (to make MARC::Errorchecks more self-contained).
-Moved readcodedata() from BBMARC (used by validate008).
-Moved DATA from MARC::BBMARC for use in readcodedata().
-Removed dependency on MARC::BBMARC.
-Added duplicate comma check in check_double_periods($record).
-Misc. bug fixes.
Planned (future versions):
-Account for undetermined dates in matchpubdates($record).
-Cleanup of validate008:
--Standardization of error reporting.
--Material-specific byte checking (bytes 18-34) abstracted to allow 006 validation.

MARC::Lintadditions:
Version 1.05: Updated Aug. 30-Oct. 16, 2004. Released Oct. 17, 2004.
-Moved institution-specific code from check_040 to MARC::QBIerrorchecks.
--check_040 still present to check $b language (currently commented out).
-Moved check_037 to MARC::QBIerrorchecks.
-Updated check_082 to ensure a decimal after the 3rd digit in numbers longer than 3 digits.
-Moved validate007([EMAIL PROTECTED]) from MARC::BBMARC (to make MARC::Lintadditions more self-contained).
-Fixed problem in 6xx check for subfield _2 (changed '==' to 'eq').
-Updated validate007([EMAIL PROTECTED]) (bug fixes, misc. revisions).
-Updated check_050 to check for unfinished cutters (single capital letter followed by a space or nothing).

MARC::BBMARC:
Version 1.07: Updated Aug. 30-Oct. 16, 2004. Released Oct. 17, 2004. Included with MARC::Errorchecks upload to CPAN.
-Moved subroutine getcontrolstocknos() to MARC::QBIerrorchecks.
-Moved validate007() to Lintadditions.pm.
-Moved validate008() and related subs to Errorchecks.pm.
--(Left readcodedata() in BBMARC, but it is now duplicated in Errorchecks.pm, along with a modified version in Lintadditions.pm.)
--Also left parse008date, which may have uses outside of error checking.
-Updated read_controlnos([$filename]) with minor changes.
--This subroutine could be rewritten in a more general way, since it simply reads all lines from a file into an array and returns that array.

[1] Distribution on CPAN: http://search.cpan.org/~eijabb/MARC-Errorchecks-1.03/ or http://www.cpan.org/modules/by-module/MARC/MARC-Errorchecks-1.03.tar.gz

Thank you, Bryan Baldus, Cataloger, Quality Books Inc. [EMAIL PROTECTED] http://home.inwave.com/eija
RE: DOS EOF character in MARC files
I don't know if it is correct, but I tried to write a test for the DOS EOF removal in MARC::File::USMARC [1]. This is the first test file I've tried to write, so please let me know where I went wrong. From the dosEOFtest.t file description: it checks t/dosEOF.usmarc, which contains one record--just sample1.usmarc with \x1a added as a final character.

When writing tests, should one take the file system into account, or is the test end-user expected to deal with this? For example, in order to use dosEOFtest.t under MacPerl, I need to change:

my $record_file_name = '/t/dosEOF.usmarc';
to
my $record_file_name = ':t:dosEOF.usmarc';

From the previous message: (BTW, when modifying existing code, use the same conventions, in this case \x1a rather than \x1A.)

Sorry about that. I wasn't sure if case mattered, so I used \x1A (which BBEdit Lite gave me when I copied in the raw character). In MARC::File::USMARC, the constants:

use constant SUBFIELD_INDICATOR => "\x1F";
use constant END_OF_FIELD => "\x1E";
use constant END_OF_RECORD => "\x1D";

have capital letters, while the garbage removal uses lowercase (as do 2 comments in decode()):

$usmarc =~ s/^[ \x00\x0a\x0d]+//;

my $dir = substr( $text, LEADER_LEN, $data_start - LEADER_LEN - 1 );
# -1 to allow for \x1e at end of directory

# character after the directory must be \x1e
(substr($text, $data_start-1, 1) eq END_OF_FIELD

[1] http://home.inwave.com/eija/bryanmodules/ , file dosEOFtest.tar.gz, or the directory dosEOFtest for uncompressed versions.

Thank you, Bryan Baldus, Cataloger (Quality Books Inc.), http://home.inwave.com/eija/
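On the cross-platform path question: the usual answer in test files is the core File::Spec module, which builds paths with the current platform's conventions (including MacPerl's colon-separated syntax) instead of hard-coding a separator. A sketch:

```perl
use strict;
use warnings;
use File::Spec;

# Build the test-file path portably instead of hard-coding '/' or ':'
# separators. On Unix this yields 't/dosEOF.usmarc'; other platforms
# (MacPerl, Windows) get their own native separators automatically.
my $record_file_name = File::Spec->catfile( 't', 'dosEOF.usmarc' );
print "$record_file_name\n";
```

With this, the same dosEOFtest.t should run unmodified under MacPerl, Windows, and Unix, with no per-platform edits to the path line.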
New versions of MARC error checking modules
I have once again updated my error checking modules. I believe I have added most of the new checks I wanted, though I still have a few in mind. Among these: rewriting the 007 and 008 validation subroutines, adding 006 validation, additional punctuation checks (before title subfields in 6xx and 7xx), a check for underscores in fields (which should not appear unless they stand for the subfield delimiter), and a check of geographic vs. topical coding in 6xx using a list of common geographic headings (e.g., if "United States" appears in 650 subfield a, report an error). I am also working on a global subject heading replacement program, in conjunction with my LCSH Weekly List Changes Parser.

I am considering distributing my modules (and scripts, probably in the same tar.gz file) on CPAN, once I figure out how to do so in an easy manner using MacPerl. Before I do, I am thinking my MARC::BBMARC module may need a new name. Right now it is just named BBMARC, for my initials plus MARC. It is a collection of functions with little in common, other than that they help my MARC-related .pl and .pm files do their work. I should probably move the validate007 and validate008 functions into Lintadditions.pm and Errorchecks.pm, making those modules more self-contained. Would it be advisable to change BBMARC's name, and if so, do you have suggestions for a new one?

Changes (Aug. 22, 2004):

Module updates:

Errorchecks.pm (http://home.inwave.com/eija/bryanmodules/):
Version 1.02: Updated Aug. 11-22, 2004. Released Aug. 22, 2004.
-Implemented VERSION (uncommented).
-Added check for presence of 040 (check_040present($record)).
-Added check for presence of 2 082s in full-level records, 1 082 in CIP-level records (check_082count($record)).
-Added temporary (test) check for trailing punctuation in 240, 586, 440, 490, 246 (check_nonpunctendingfields($record))
--which should not end in punctuation except when the data itself ends in such.
-Added check_fieldlength($record) to report fields longer than 1870 bytes.
--This should be rewritten to use the length in the directory of the raw MARC.
-Fixed workaround in check_bk008_vs_bibrefandindex($record) (Thanks again to Rich Ackerman).

Lintadditions.pm (http://home.inwave.com/eija/bryanmodules/):
Version 1.04: Updated Aug. 10-22, 2004. Released Aug. 22, 2004.
-Implemented VERSION (uncommented).
-Revised check_050 exception (Thank you to all who posted about this).
-Moved VERSION HISTORY to end of module.
-Added preliminary checking of 245 2nd indicator in check_245 (Thanks to Ian Hamilton).

BBMARC.pm (http://home.inwave.com/eija/bryanmodules/):
Version 1.06: Updated Aug. 10-22, 2004. Released Aug. 15, 2004.
-Implemented VERSION (uncommented).
-Added subroutine getcontrolstocknos().
-General readability cleanup (added tabs).
-Bug fix in validate008 for date2 check.

Planned (next release):
-Cleanup of validate008 (and validate007).
--Standardization of error reporting.
--Material-specific byte checking (bytes 18-34) abstracted to allow 006 validation.

Added and changed scripts:
-Updated LCSH Changes Parser script, LCSHchangesparser2.txt (http://home.inwave.com/eija/inprocess/LCSHchangesparser2.txt):
--Adds 500 to the tag number if it is 1xx, so that it becomes 600-655, in preparation for use in global replacement.
--Misc. fixes.
-lintwithadditionsselective.txt (http://home.inwave.com/eija/fullrecscripts/lintwithadditionsselective.txt)
--Similar to lintwithadditions, but designed to call only specific check_xxx functions in either MARC::Lint or MARC::Lintadditions.
--This has been tested only minimally, but may see future use as a basis for test files.

As usual, I welcome comments, suggestions, questions, etc.

Thank you for your assistance,

Bryan Baldus
Cataloger
Quality Books, Inc.
[EMAIL PROTECTED]
http://home.inwave.com/eija
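The planned underscore and geographic-coding checks mentioned above could be sketched roughly as follows. This is a hypothetical illustration, not module code: the sub names are invented, the checks operate on plain tag/data strings rather than a MARC::Record object, and the geographic heading list is only a token sample.

```perl
use strict;
use warnings;

# Hypothetical sketch of the planned underscore check: underscores
# should not appear in field data unless they stand for the subfield
# delimiter, so any literal underscore is reported.
sub check_underscores {
    my ($tag, $fielddata) = @_;
    return $fielddata =~ /_/ ? ("$tag: Field contains an underscore.") : ();
}

# Hypothetical sketch of the geographic-vs-topical idea: if a heading
# from a (here, token) list of common geographic headings appears in
# 650 subfield a, report an error.
my @common_geog = ('United States', 'Great Britain', 'Illinois');
sub check_650a_geog {
    my ($heading650a) = @_;
    foreach my $geog (@common_geog) {
        return ("650: Geographic heading '$geog' in 650\$a; should probably be 651.")
            if $heading650a eq $geog;
    }
    return ();
}
```

The real checks would of course need a much larger heading list and record-level context (since, as noted, such headings are not always miscoded).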
RE: Warnings during decode() of raw MARC
How do you typically do the install? MARC::Record is included in the ActiveState PPM Repository, so it should do these things on a Windows platform, assuming nmake or some sort of make variant is being used.

At home (on the Mac), I just drop the MARC folder into my site_lib folder in the MacPerl folder (MacPerl adds site_perl to @INC automatically, I believe). This is after I expand the tar.gz files with StuffIt Expander. There used to be an installer.pl for Mac, but it no longer works with the current version of MacPerl (5.8.0a). Since the documentation in MacPerl seems to indicate that it is not compatible with MakeMaker, drag-and-drop installation seems to be the easiest alternative. To convert line endings, I use either BBEdit Lite (a 3rd-party program) or a script I just wrote that should convert line endings and change the Type and Creator (to TEXT and BBEdit).

In Windows, I take the folder from home, convert the line endings from Mac to DOS, and then drop the MARC folder into C:\Perl\site\lib\. This (dragging and dropping) seems to work fine for most stand-alone modules (where a C compiler is not needed). In some cases, I do look at the Makefile.PL, for example with MARC::Charset, where it was necessary to create a database file of East Asian character sets. Of course, once I got that installed (through drag-dropping), it gave a number of errors when I ran the tests, probably because of my operating system (Mac OS 9.2.2) not working well with Unicode?

I do generally try running each of the test files when I first install a new module, just to make sure they work OK, but I've not usually bothered to look at how the tests or the Makefile.PL work. This is one reason I haven't tried to distribute my modules through CPAN.

Bryan Baldus
http://home.inwave.com/eija/ (http://home.inwave.com/eija/readme.htm)
MARC error checking with Perl updates and question
I have once again updated my error checking modules [1], (MARC::)Errorchecks.pm and (MARC::)Lintadditions.pm. I am running out of new things to check for, though I do have a few ideas in mind, including attempting to find miscoded geographic and topical headings (e.g., if "United States" appears in a 6xx subfield other than 651$a or 6xx$z, it may be miscoded (though not always); if "Dogs" is in 651$a or 6xx$z, it is probably miscoded), as well as the items in the current planned/in-progress tasks on my site. I have added a question concerning grep at the end of this message. Thank you for any assistance you may be able to provide.

Changes (Aug. 8, 2004):

Module updates:

Errorchecks.pm:
Version 1.01: Updated July 20-Aug. 7, 2004. Released Aug. 8, 2004.
-Temporary (or not) workaround for check_bk008_vs_bibrefandindex($record) and bibliographies.
-Removed variables from some error messages and cleaned up messages.
-Code readability cleanup.
-Added subroutines:
--check_240ind1vs1xx($record) -- Reports errors based on whether 240 and 1xx are both present and the first indicator is 1 or 0.
--check_041vs008lang($record) -- Compares the first code in subfield 'a' of 041 vs. 008 bytes 35-37.
--check_5xxendingpunctuation($record) -- Looks for final punctuation in several of the 5xx fields.
--findfloatinghypens($record) -- Looks for space-hyphen-space in each field (in a list of given fields).
--video007vs300vs538($record) -- In video records, compares 007 values vs. 300 and 538 fields. Limited to VHS, DVD, and Video CD.
--ldrvalidate($record) -- Checks for valid bytes in the user-changeable leader bytes.
--geogsubjvs043($record) -- Reports a missing 043 if 651 or 6xx$z is present; has a list of exceptions (e.g. English-speaking countries).
--findemptysubfields($record) -- Looks for empty subfields (e.g. $x$xPsychology.).
-Changed subroutines:
--check_bk008_vs_300:
---Added cross-checking for codes a, b, c, g (ill., map(s), port(s)., music).
---Added checking for 'p. ' or 'v. ' or 'leaves ' in subfield 'a'.
---Added checking for 'cm.', 'mm.', 'in.' in subfield 'c'.
---Revised check for 'm', phono. (which QBI doesn't currently use).
--Added check in check_bk008_vs_bibrefandindex($record) for 'Includes index.' (or indexes) in 504.
---This has a workaround I would like to figure out how to fix.

Lintadditions.pm:
Version 1.03: Updated July 20-Aug. 7, 2004. Released Aug. 8, 2004.
-Added check_1xx and check_7xx sets.
-Added checks for non-filing indicators in 130, 630, 730, 740, and 830.
-Added indicator check for 700 (ind1 == 3 is an error).
-Added validation of 041 against the MARC Code List for Languages.
-Added check_028 and check_037.
-Removed some variables from warning messages.
-Added check_050.
-Added check_040 (IOrQBI specific).
-Added check_440 and check_490.
-Added check_246.
-Changed check_245 ending punctuation errors based on the MARC 21 rule change vs. LCRI 1.0C from Nov. 2003.
-Added check for square brackets in 245 $h.
-Added check for 260 ending punctuation.

Added and changed scripts:
Most of these are test scripts created while writing the subroutines listed above. The subroutines in the modules may have code not in the scripts, so it is best to use the module rather than the script for those checks (the last three full-record scripts).

Full record:
-fieldsubfieldcounts.txt -- Field and subfield count; reports totals for each tag and subfield.
--First version: field tag counts only.
-testnewerrorchecks.txt -- Test script to call new subroutines in Errorchecks.pm (MARC::Errorchecks).
-ldrvalidatescript.txt -- In Errorchecks.pm.
-viddvdvsvhs.txt -- In Errorchecks.pm.
-findemptysubfields.txt -- Looks for empty subfields. Skips 037 in CIP-level records. In Errorchecks.pm.

Cleanup:
-find050doubleperiod.txt -- Test regex for finding a pattern in 050$a. Preliminary code for MARC::Lintadditions::check_050().
-removetitlefromlintrpt.txt -- Removes titles from lintallchecks' output file.
-findmissing300apunctuation.txt -- Looks for a missing period after p or v in a 300$a extract file. Initial step for MARC::Errorchecks::check_bk008_vs_300($record) code.

Question: In the following code, is there a more efficient way to write the grep for "Includes index(es)." to get the same result?

### workaround ###
my @indexin504 = grep {$_ =~ /Includes(.*)index(es)?(\.)*/; push @indexalone, $1.$3; $_ =~ /Includes(.*)index(\.)*/;} @fields504;
#look for 'Includes index.' in 504
foreach my $indexalonein504 (@indexalone) {
    #report error if there is only a space between 'Includes' and 'index' (followed by a period)
    if ($indexalonein504 =~ /^ \.$/) {
        push @warningstoreturn, ("504: 'Includes index' should be 500.");
    } #if index is alone in 504
} #foreach index alone

---
[1] My home page: http://home.inwave.com/eija/

Thank you,

Bryan Baldus
Cataloger
Quality Books, Inc.
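For what it is worth, one simpler shape the check above might take (a sketch only, assuming @fields504 holds each 504 field's data as a plain string) is to match each field directly instead of building the intermediate @indexalone array:

```perl
use strict;
use warnings;

# Sketch: report any 504 that consists only of 'Includes index.' (or
# 'Includes indexes.'), i.e. where nothing but whitespace separates
# 'Includes' from 'index', since such a note belongs in a 500 instead.
sub check_504_index_alone {
    my (@fields504) = @_;
    my @warningstoreturn;
    foreach my $field (@fields504) {
        if ($field =~ /^Includes\s+index(es)?\.?$/) {
            push @warningstoreturn, "504: 'Includes index' should be 500.";
        }
    }
    return @warningstoreturn;
}
```

A field such as "Includes bibliographical references and index." would not match, because text intervenes between "Includes" and "index".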
RE: retain repeatable subfields
I may be wrong (as I am new to Perl), but I believe $r534->subfield('n') is being called in a scalar context, so it retrieves only the first instance of subfield n. Perhaps:

my @subfields = $r534->subfields();
my @newsubfields = ();

#break subfields into a code-data array (so the entire field is in one array)
while (my $subfield = pop(@subfields)) {
    my ($code, $data) = @$subfield;
    unshift (@newsubfields, $code, $data);
} # while

would work better? Then parse the array for the desired subfields?

Please correct me if I am wrong. Hope this helps,

Bryan Baldus
Cataloger
Quality Books, Inc.
The Best of America's Independent Presses
[EMAIL PROTECTED]
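To illustrate parsing that flattened code/data array for repeats of one subfield, here is a self-contained sketch. The literal pairs and the helper sub below are invented for illustration; the pairs merely stand in for what $r534->subfields() would produce for a field containing two $n subfields.

```perl
use strict;
use warnings;

# The pairs stand in for a flattened 534 field: code, data, code, data...
my @newsubfields = (
    'p', 'Reprint. Originally published:',
    'n', 'First note.',
    'n', 'Second note.',
);

# Walk the code/data pairs and collect every value of the wanted code,
# so repeated subfields are all retained rather than just the first.
sub all_subfield_values {
    my ($wanted, @pairs) = @_;
    my @values;
    while (@pairs) {
        my ($code, $data) = splice(@pairs, 0, 2);
        push @values, $data if $code eq $wanted;
    }
    return @values;
}

my @notes = all_subfield_values('n', @newsubfields);
# @notes now holds both instances of subfield n
```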
Perl-based MARC record error checking update and questions
language and country codes.

Version 1.03: Updated June 10, not released.
-Contained many of the changes in 1.04, but 1.04 contains the update to validate008, so I wanted a new version.

--
[1] My home page: http://home.inwave.com/eija
[2] Link to Errorchecks current version: http://home.inwave.com/eija/bryanmodules/MARC-Errorchecks-0.95/Errorchecks.pm.txt (try http://home.inwave.com/eija/bryanmodules/ if the above fails)
[3] lintallchecks.pl: http://home.inwave.com/eija/fullrecscripts/lintallchecks.txt
[4] Link to Lintadditions current version: http://home.inwave.com/eija/bryanmodules/MARC-Lintadditions-1.01/Lintadditions.pm.txt (try http://home.inwave.com/eija/bryanmodules/ if the above fails)
[5] Link to BBMARC current version: http://home.inwave.com/eija/bryanmodules/MARC-BBMARC-1.04/BBMARC.PM.txt (try http://home.inwave.com/eija/bryanmodules/ if the above fails)

I welcome any suggestions, questions, and comments (to this address, or to the one listed on my site).

Thank you,
Bryan Baldus
Cataloger
Quality Books Inc.
[EMAIL PROTECTED]
http://home.inwave.com/eija
BBMARC updated
I have updated my MARC::BBMARC module. The new version is available at http://home.inwave.com/eija/mac/MARC-BBMARC-1.01/BBMARC.pm.txt (also http://home.inwave.com/eija/unix/MARC-BBMARC-1.01/BBMARC.pm.txt and http://home.inwave.com/eija/win/MARC-BBMARC-1.01/BBMARC.pm.txt).

New to this version is validate008, which reads an 008 field (actually a string of bytes) and reports back any invalid characters in a tab-separated scalar reference. It also returns a hash reference containing named character positions, and a cleaned version of the initial string (probably not useful, since little or no cleaning occurs in the validate008 subroutine). To use the new subroutine, I wrote 008checker.pl.txt (available at http://home.inwave.com/eija/mac/templatified/008checker.pl.txt).

Other changes to BBMARC were minor, and some are listed in a changes section of the module. I have also updated my home page with information about changes and planned projects (http://home.inwave.com/eija/). I welcome any comments and corrections you may have.

Bryan Baldus
Cataloger
Quality Books, Inc.
The Best of America's Independent Presses
[EMAIL PROTECTED]
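As a rough illustration of the kind of byte-position checking described above (this is not the module's validate008; the sub name is invented and the positions and value lists are only a small sample of what the full subroutine would cover):

```perl
use strict;
use warnings;

# Illustrative sketch: check a few named byte positions of an 008
# string and return any problems as a tab-separated string, in the
# spirit of the validate008 subroutine described in the post.
sub check_008_sketch {
    my ($f008) = @_;
    my @problems;
    if (length($f008) != 40) {
        return 'Length is not 40 bytes';
    }
    # Bytes 00-05: date entered on file, six digits (yymmdd)
    push @problems, 'Bytes 00-05 (date entered) are not six digits'
        unless substr($f008, 0, 6) =~ /^\d{6}$/;
    # Byte 06: type of date/publication status
    push @problems, 'Byte 06 (type of date) is invalid'
        unless substr($f008, 6, 1) =~ /^[bcdeikmnpqrstu|]$/;
    # Bytes 35-37: language code, three lowercase letters (or blank)
    push @problems, 'Bytes 35-37 (language) are invalid'
        unless substr($f008, 35, 3) =~ /^(?:[a-z]{3}|   )$/;
    return join("\t", @problems);
}
```

An empty return string would mean the sampled positions all passed.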
RE: unsubsribe
Subscription information may be found at http://perl4lib.perl.org/, which states: To subscribe, unsubscribe, or contribute to the list use one of the following addresses: [EMAIL PROTECTED] [EMAIL PROTECTED] Hope this helps, Bryan Baldus Cataloger Quality Books, Inc. The Best of America's Independent Presses 1-800-323-4241x460 [EMAIL PROTECTED] -Original Message- From: Holly Bravender [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 25, 2004 10:56 AM To: [EMAIL PROTECTED] Subject: unsubsribe Take me off your list! Thank you. Holly Bravender Reference Instruction Librarian Paul V. Galvin Library Illinois Institute of Technology 35 W. 33rd Street Chicago, IL 60616 www.gl.iit.edu (312) 567-3373 [EMAIL PROTECTED]
MARC-related scripts and code
As you may recall, I am a cataloger with limited programming experience, and I have been teaching myself Perl using the MARC::Record modules. I have posted the code I have been working on to my (hastily created) home page, at http://home.inwave.com/eija/.

One of the files included is BBMARC.pm, which is designed to go in the MARC folder/directory of MARC::Record. This file contains a number of subroutines, including validate007() for checking that each byte of an 007 is within the range of valid values and that there is no extra data after the format's limit. I believe the section on motion pictures is unfinished (we don't have any, so I didn't go to the trouble of updating that section), but the logic should follow that in the electronic resources section.

Please send me any comments you might have.

Thank you,
Bryan Baldus
Cataloger
Quality Books, Inc.
The Best of America's Independent Presses
[EMAIL PROTECTED]
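The byte-range idea behind validate007() can be illustrated with a small sketch. This is not the module's code: the sub name is invented, it covers only electronic resources (007/00 = 'c'), and the value list for byte 01 is deliberately abbreviated; the real subroutine would carry each format's full table of valid values.

```perl
use strict;
use warnings;

# Illustrative sketch: for an electronic-resource 007, verify byte 01
# against an abbreviated list of valid codes and reject data past the
# format's limit (positions 00-13 for electronic resources).
sub check_007_er_sketch {
    my ($f007) = @_;
    return ('Byte 00 is not c (electronic resource)')
        unless substr($f007, 0, 1) eq 'c';
    my @problems;
    # Byte 01: specific material designation (abbreviated value list)
    push @problems, 'Byte 01 is invalid'
        unless substr($f007, 1, 1) =~ /^[abcjmoruz|]$/;
    push @problems, 'Extra data after byte 13'
        if length($f007) > 14;
    return @problems;
}
```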