Re: Removing duplicate fields with MARC::record
MUST ... RESIST ... URGE ... FOR ... ONEUPMANSHIP!!! AAAH!!! ;)

...
while (my $record = $batch->next) {

    # find 'em
    my @m856 = $record->field('856');

    # get rid of 'em
    $record->delete_field( $_ ) for @m856;

    # map to a hash for direct uniqueness
    my %u856 = (map { ($_->as_usmarc => $_) } @m856);

    # then add 'em back
    $record->insert_fields_ordered( values( %u856 ) );
}

# sorry for the top-posting... and it's untested :)

__END__

--miker

On 7/30/07, Bryan Baldus <[EMAIL PROTECTED]> wrote:
> Note: my comments are untested and may not work without modification. Some
> parts left to the reader to complete.
>
> On Monday, July 30, 2007 2:16 PM, Michael Bowden wrote:
> > @m856 = sort {$a cmp $b} @m856;
>
> @m856 has MARC::Field objects. Comparing them as such is unlikely to
> produce desired results. Better might be
> @m856 = sort {$a->as_usmarc() cmp $b->as_usmarc()} @m856,
> but then you lose the field object. Better might be to leave out that step
> and go on to:
>
> > my %seen = ();
> > my @new856 = ();
>
> Instead of going through all fields in the record, you could go through the
> 856s you have gathered, add them to the %seen hash as usmarc (to facilitate
> comparisons), and, as subsequent ones are already seen, delete the field.
> After that, you could sort the fields, delete them, and then add back the
> sorted fields.
>
> if (@m856) {
>     foreach my $f (@m856) {
>         #add this field to seen fields if not seen
>         unless ($seen{$f->as_usmarc}) {
>             $seen{$f->as_usmarc} = $f;
>         } #unless seen this field's exact data
>         else {
>             #seen it, so delete current
>             $record->delete_field($f);
>         } #else seen this field
>     } #foreach 856
> } #if 856s
>
> my @new856 = (); #add values of %seen, sorted according to keys of %seen
>
> ###sort remaining/deduplicated 856 fields, delete existing fields, and then
> ###add sorted fields back.
> ###where @new856 contains the values of %seen, sorted according to the keys
> ###of %seen
>
> $record->insert_fields_ordered( @new856 );
>
> I hope this helps,
>
> Bryan Baldus
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> http://home.inwave.com/eija
RE: Removing duplicate fields with MARC::record
Note: my comments are untested and may not work without modification. Some parts left to the reader to complete.

On Monday, July 30, 2007 2:16 PM, Michael Bowden wrote:
> @m856 = sort {$a cmp $b} @m856;

@m856 has MARC::Field objects. Comparing them as such is unlikely to produce desired results. Better might be

    @m856 = sort {$a->as_usmarc() cmp $b->as_usmarc()} @m856,

but then you lose the field object. Better might be to leave out that step and go on to:

> my %seen = ();
> my @new856 = ();

Instead of going through all fields in the record, you could go through the 856s you have gathered, add them to the %seen hash as usmarc (to facilitate comparisons), and, as subsequent ones are already seen, delete the field. After that, you could sort the fields, delete them, and then add back the sorted fields.

    if (@m856) {
        foreach my $f (@m856) {
            #add this field to seen fields if not seen
            unless ($seen{$f->as_usmarc}) {
                $seen{$f->as_usmarc} = $f;
            } #unless seen this field's exact data
            else {
                #seen it, so delete current
                $record->delete_field($f);
            } #else seen this field
        } #foreach 856
    } #if 856s

    my @new856 = (); #add values of %seen, sorted according to keys of %seen

    ###sort remaining/deduplicated 856 fields, delete existing fields, and then add sorted fields back.
    ###where @new856 contains the values of %seen, sorted according to the keys of %seen

    $record->insert_fields_ordered( @new856 );

I hope this helps,

Bryan Baldus
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://home.inwave.com/eija
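One possible way to fill in the parts left to the reader is sketched below. It is only a sketch under assumptions: the test record, its title and its URLs are invented for illustration, and it removes every 856 first and then re-adds one copy of each, rather than deleting only the repeats. It relies on field(), delete_field(), insert_fields_ordered() and as_usmarc() as they are used elsewhere in this thread.

```perl
use strict;
use warnings;
use MARC::Record;
use MARC::Field;

# Invented test record: one 245, two identical 856s and one distinct 856.
my $record = MARC::Record->new();
$record->append_fields(
    MARC::Field->new('245', '0', '0', a => 'Example title'),
    MARC::Field->new('856', '4', '0', u => 'http://example.org/b'),
    MARC::Field->new('856', '4', '0', u => 'http://example.org/a'),
    MARC::Field->new('856', '4', '0', u => 'http://example.org/b'),
);

my %seen;
for my $f ($record->field('856')) {
    # Key on the serialized field content, not on the object itself,
    # and remember only the first field seen for each key.
    my $key = $f->as_usmarc;
    $seen{$key} = $f unless exists $seen{$key};

    # Remove every 856 for now; the unique ones are re-added below.
    $record->delete_field($f);
}

# Unique fields, ordered by their serialized form.
my @new856 = map { $seen{$_} } sort keys %seen;
$record->insert_fields_ordered(@new856);

# Show what is left.
print $_->as_formatted, "\n" for $record->field('856');
```

The list handed to insert_fields_ordered() is sorted, but whether the 856s keep exactly that order inside the record depends on how that method places fields sharing a tag, which I have not verified. If the order matters, appending the list with append_fields() may be the safer choice when 856 is the last tag in the record.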
Removing duplicate fields with MARC::record
Hi Folks:

Hmmm... I am not sure how my address got in the middle of the code in that last message. Here is a corrected version!

Michael

It is me again. I have another question...

I am helping someone clean up her database. Somehow, the 856 field in her MARC records has duplicated itself several times. She has some records with 15+ duplicate 856 fields. So I am trying, unsuccessfully, to modify one of my scripts to delete the duplicate fields. The problem I am having is that out of the 15+ 856 fields, 3 of the 856 fields are unique to the record and need to stay.

Here is my code so far:

    ## create a MARC::File::USMARC object
    use MARC::Batch;

    my $infile = shift;
    my $otfile = shift;
    my $batch = MARC::Batch->new('USMARC',$infile);
    $batch->strict_off();
    open (DBOUT, "> $otfile");

    ## if $file isn't defined we had trouble with the file
    ## so exit
    if (not($infile)) {
        print $MARC::File::ERROR,"\n";
        exit(0);
    }

    while (my $record = $batch->next()) {
        my @m856 = $record->field('856');
        @m856 = sort {$a cmp $b} @m856;

        my %seen = ();
        my @new856 = ();

        foreach ( $record->fields() ) {
            if (@m856) {
                foreach $f (@m856) {
                    next if ($seen{ $f }++);
                    push @new856, $f;
                    $record->delete_field($f);
                }
            }
        }
        $record->insert_fields_ordered( @new856 );
        print DBOUT $record->as_usmarc();
    }

When I run this script, it puts ALL the 856 fields back in the record and they are not sorted. What am I doing wrong?

TIA!

Michael
Harrisburg Area Community College
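A likely explanation for the symptom, sketched as a small test: when a MARC::Field object is used as a hash key or compared with cmp, Perl uses its stringified reference (something like MARC::Field=HASH(0x...)), which is different for every object even when the contents are identical. This assumes MARC::Field does not overload stringification, which I believe is the case but have not checked against every version. The URL is made up.

```perl
use strict;
use warnings;
use MARC::Field;

# Two separate objects with identical content.
my $f1 = MARC::Field->new('856', '4', '0', u => 'http://example.org/a');
my $f2 = MARC::Field->new('856', '4', '0', u => 'http://example.org/a');

# Keyed on the object: each key is the object's stringified reference,
# so the two fields never collide and both are kept.
my %seen_by_object;
my @kept_by_object = grep { !$seen_by_object{$_}++ } ($f1, $f2);

# Keyed on the serialized content: identical fields share one key,
# so only the first is kept.
my %seen_by_content;
my @kept_by_content = grep { !$seen_by_content{ $_->as_usmarc }++ } ($f1, $f2);

printf "keyed on object:  %d kept\n", scalar @kept_by_object;
printf "keyed on content: %d kept\n", scalar @kept_by_content;
```

The same reasoning would apply to sort {$a cmp $b}: it orders the fields by their stringified references, not by their data, which would explain why the output does not look sorted.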
Removing duplicate fields with MARC::record
Hi Folks:

It is me again. I have another question...

I am helping someone clean up her database. Somehow, the 856 field in her MARC records has duplicated itself several times. She has some records with 15+ duplicate 856 fields. So I am trying, unsuccessfully, to modify one of my scripts to delete the duplicate fields. The problem I am having is that out of the 15+ 856 fields, 3 of the 856 fields are unique to the record and need to stay.

Here is my code so far:

    ## create a MARC::File::USMARC object
    use MARC::Batch;

    my $infile = shift;
    my $otfile = shift;
    my $batch = MARC::Batc

Michael L. Bowden
Coordinator of Automation and Access Services
Associate Professor, Information Science
Harrisburg Area Community College
One HACC Drive
Harrisburg, PA 17110-2999
E: [EMAIL PROTECTED]
T: 717.780.1936
F: 717.780.2462

    h->new('USMARC',$infile);
    $batch->strict_off();
    open (DBOUT, "> $otfile");

    ## if $file isn't defined we had trouble with the file
    ## so exit
    if (not($infile)) {
        print $MARC::File::ERROR,"\n";
        exit(0);
    }

    while (my $record = $batch->next()) {
        my @m856 = $record->field('856');
        @m856 = sort {$a cmp $b} @m856;

        my %seen = ();
        my @new856 = ();

        foreach ( $record->fields() ) {
            if (@m856) {
                foreach $f (@m856) {
                    next if ($seen{ $f }++);
                    push @new856, $f;
                    $record->delete_field($f);
                }
            }
        }
        $record->insert_fields_ordered( @new856 );
        print DBOUT $record->as_usmarc();
    }

When I run this script, it puts ALL the 856 fields back in the record and they are not sorted. What am I doing wrong?

TIA!

Michael
Harrisburg Area Community College