Jack Daniels (Butch) wrote on Thursday, 23 February 2006, 10:30:
> It's driving me bonkers and I can't afford any more psychiatric bills. The
> data is a saved .txt file from viewing a website. The vendor will not
> give us an actual file even though we paid a monthly fee for use of the
> database. I have around 5000 records that need to be converted to MARC
> cataloging records. I need to either have the data from each heading on 1
> line or have the script extract each heading and all the subsequent lines.
>
> The script is only extracting the first line of the heading..

Yes. On every pass through @lines you overwrite the variables $title
through $dewey, and you only ever print a line that itself contains a
heading word. The continuation lines match none of the patterns, so they
are dropped.

> I can only
> have 1 blank line between each record which works in the script. If I right
> click then import to excel when viewing the records at the website, each
> heading is a continuous string, which is what I need. I can then save as a
> tab delimited file and the lines for each heading remain continuous, which
> works. But we have ceased our subscription and I now only have saved .txt
> files of the 5000 records. I can't figure out how and where to modify the
> script to work on the files. I suppose I could spend a couple months
> manually joining lines, but that really cuts into naptime.

I don't know what MARC cataloging records are, my English is not good
enough to understand exactly what you mean, and I don't know whether the
leading spaces on every line below are in the sample data. Still, it may
help you to produce a CSV file from the data. So you can adjust my script
below, or wait for the better solutions that will undoubtedly arrive:

[...]
> HERE IS THE SCRIPT
>
> open(MYINPUTFILE, "<1000chomp.txt"); # open for input
>
> my(@lines) = <MYINPUTFILE>; # read file into list
>
> my $title;
> my $series;
> my $subjects;
> my $physical;
> my $synopsis;
> my $producer;
> my $copyrighted;
> my $dewey;
> for my $line (@lines)
> {
>
> $line =~ /Title/ and $title = $line;
> $line =~ /Title/ and print "=LDR 00000nam 2200000Ia 45e0\n","=245 00\$a",$line;
>
> $line =~ /Dewey/ and $dewey = $line;
> $line =~ /Dewey/ and print "=082 \\\\\$a",$line;
>
> $line =~ /Producer/ and $producer = $line;
> $line =~ /Producer/ and print "=040 \\\\\$aCaSRRI\n","=260 \\\\\$a",$line;
>
> $line =~ /Copyrighted/ and $copyrighted = $line;
> $line =~ /Copyrighted/ and print "=261 \\\\\$c",$line;
>
> $line =~ /Physical/ and $physical = $line;
> $line =~ /Physical/ and print "=300 \\\\\$a1 videocassette ( min.) :\$bsd., col. ;\$c13 mm.",$line;
>
> $line =~ /Series/ and $series = $line;
> $line =~ /Series/ and print "=440 0\\\$a",$line;
>
> $line =~ /Synopsis/ and $synopsis = $line;
> $line =~ /Synopsis/ and print "=520 \\\\\$a",$line;
>
> $line =~ /Subjects/ and $subjects = $line;
> $line =~ /Subjects/ and print "=550 \\\\\$a",$line,"\n";
> }

========================
#!/usr/bin/perl
use strict;
use warnings;

local $/=""; # paragraph mode: split data at 1..n empty lines

# btw: Series does not occur in the sample data
# note: Holdings is not listed, so it stays attached to the Subjects value
my $stops=qr/(?:Title)|(?:Physical)|(?:Copyrighted)|(?:Producer)|(?:Dewey)|(?:Synopsis)|(?:Subjects)|(?:Series)/;

for my $record (<DATA>) {
  my @pairs=split (/($stops)/, $record);
  shift @pairs; # remove the 1st entry (text before the first heading, normally empty)
  my %keyed=@pairs;
  $keyed{$_}=~s/\s+/ /gs for keys %keyed; # join continuation lines

  # now you have one record as key/one-line-value pairs
  # for further processing, see:
  print join "\n", map {"$_=>$keyed{$_}"} keys %keyed;
  print "\n\n";
  # you could sort it, produce a CSV-file, ...
}

__DATA__
Title 10 fastest growing careers: jobs for the future part four
business and computer technology (03616)
Physical Color; Sound; 15 minutes
Copyrighted 1990
Producer GUIDANCE ASSOCIATES (GUID)
Dewey 371.425
Synopsis Contents: The business community depends on up-to-the minute
technology - technology that is changing rapidly. As a result, careers
in technology, especially computers and specialized areas such as
accounting are much in demand. Takes a look at three business and
computer careers: software engineering, computer programming and
accounting.
Subjects CAREER GUIDANCE; CAREER SERVICES
Holdings 1/2 VHS video: Head Office, 1 copy

Title 10 fastest growing careers: jobs for the future part one
legal and health (03613)
Physical Color; Sound; 15 minutes
Copyrighted 1990
Producer GUIDANCE ASSOCIATES (GUID)
Dewey 371.425
Synopsis Contents: Takes a look at the fast growing health and legal
fields. Talks to a registered nurse about her changing role in a major
hospital, a physician's assistant who works with two doctors in a busy
family practice, and a paralegal who works with an attorney.
Subjects CAREER GUIDANCE; CAREER SERVICES
Holdings 1/2 VHS video: Head Office, 1 copy
=============
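
For the CSV idea, here is a minimal sketch of how the key/value pairs could be written out one record per line. It is only an illustration under a few assumptions of mine, not part of the original script: the sub names parse_record, csv_field and csv_row are made up, every heading is assumed to start a line (leading spaces allowed), and Holdings is added as its own column so that it no longer ends up inside the Subjects value. The sample record is a shortened copy of the data above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Column order for the CSV file. Holdings is added here as an assumption,
# so that it no longer ends up inside the Subjects value.
my @fields = qw(Title Physical Copyrighted Producer Dewey
                Synopsis Subjects Series Holdings);
my $stops  = join '|', @fields;

# Split one record into a hash of heading => one-line value.
# Assumption: every heading starts a line (leading spaces allowed).
sub parse_record {
    my ($record) = @_;
    my @parts = split /^\s*($stops)\b/m, $record;
    shift @parts;                    # text before the first heading
    push @parts, '' if @parts % 2;   # last heading had no value
    my %keyed = @parts;
    for (values %keyed) {
        s/\s+/ /g;                   # join continuation lines
        s/^ | $//g;                  # trim the ends
    }
    return \%keyed;
}

# Quote one CSV field: wrap it in double quotes, double embedded quotes.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# One CSV line per record, columns in @fields order.
sub csv_row {
    my ($keyed) = @_;
    return join ',', map { csv_field($keyed->{$_}) } @fields;
}

# Shortened sample record taken from the data above.
my $sample = <<'END';
Title 10 fastest growing careers: jobs for the future part one
  legal and health (03613)
Physical Color; Sound; 15 minutes
Copyrighted 1990
Producer GUIDANCE ASSOCIATES (GUID)
Dewey 371.425
Subjects CAREER GUIDANCE; CAREER SERVICES
Holdings 1/2 VHS video: Head Office, 1 copy
END

# Header line first, then one line per blank-line-separated record.
print join(',', map { csv_field($_) } @fields), "\n";
print csv_row(parse_record($_)), "\n" for split /\n\s*\n/, $sample;
```

To run it on the saved files, replace $sample with the file contents (or set $/ to "" and read the records from a filehandle as in the script above). A continuation line that happens to begin with one of the heading words would still be taken for a new heading, so spot-check the output.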