Here are two sample data files and the full script, minus a bunch of comments that aren't needed.

Thanks,
Tim

Tim McGeary
Senior Library Systems Specialist
Lehigh University
610-758-4998
[EMAIL PROTECTED]



Charles K. Clarkson wrote:
From: Tim McGeary <mailto:[EMAIL PROTECTED]> wrote:

: Ok, here's the code that is pushing $cat_key, $title,
: $url, and $code into the @fund_array:
:
: @sirsi_array has a DB_key, a Title and "data" which
: is either a code or a URL, in such a manner:
:
: 101|Journal of Nature|PBIO|
: 101|Journal of Nature|http://www.nature.org/|
: 102|Journal of Crap|PBIO|
: 102|Journal of Crap|http://www.crap.org/|
: 102|Journal of Crap|http://www.jstor.org/crap/|
:
: Each of the above rows is what the @sirsi_array
: would have. Here's the pseudocode for the real code
: below:
:
: Split array element.
: Check to see if we have a new DB_key
:    If yes, check to see if code matches allowed codes.
:       If yes, then set $code=$data and $titles=1.
:       Else, $titles=0, push to @reject_array and move on.
:    If no, set $title
:       check for jstor or prola in $data
:          If yes, append (archive) to $title
:       set $url with $ezproxy prefix
:       set $cat_key with simple algorithm based on # of titles
:       push $cat_key, $title, $url, and $code to @fund_array
:       increment $titles
:
: Real code:
:
: my $preCat = 0;
: my $titles = 0;
: my $title = '';
: my $code = '';
: my $url = '';
: my @fund_array;
: my @reject_array;



You have 'strict' turned off. Not a good idea. You'll need to present a working example or more code (all of it?). I added this, but there is still more missing.

my @sirsi_array = (
    '101|Journal of Nature|PBIO|',
    '101|Journal of Nature|http://www.nature.org/|',
    '102|Journal of Crap|PBIO|',
    '102|Journal of Crap|http://www.crap.org/|',
    '102|Journal of Crap|http://www.jstor.org/crap/|',
);


This part of the code doesn't look like it has an error, but with all those global variables running around, we can't be sure. We'll need to see more code to help you find the error you're getting.
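
For illustration only, here is a minimal sketch of how the pseudocode quoted above might look with 'strict' enabled and every variable declared lexically. The sub name sort_records, its parameters, and the proxy prefix in the usage lines are hypothetical and not part of the original script. The sketch also swaps the regex test against each code for an exact hash lookup, which is an assumption about what was intended, and it is not a diagnosis of the reported error.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): takes prtentry-style
# lines, a hash of allowed fund codes, a proxy prefix and the cat_key
# multiplier, and returns references to the accepted and rejected rows.
sub sort_records {
    my ($lines, $codes, $ezproxy, $cat_value) = @_;

    my (@fund, @reject);
    my $pre_cat = '';   # cat_key of the record currently being processed
    my $titles  = 0;    # counter used to build a unique cat_key per URL row
    my $code    = '';   # fund code of the current record

    for my $line (@$lines) {
        my ($key, $title, $data) = split /\|/, $line;
        $data = '' unless defined $data;

        if ($key ne $pre_cat) {              # first row of a new record
            if (exists $codes->{$data}) {    # assumption: exact match, not a regex
                ($pre_cat, $code, $titles) = ($key, $data, 1);
            }
            else {
                push @reject, "$key|$title|";
            }
            next;
        }

        # URL row belonging to the current record
        $title .= '(archive)' if $data =~ /jstor|prola/;
        my $cat_key = $titles * $cat_value + $key;
        push @fund, join('|', $cat_key, $title, "$ezproxy$data", $code) . '|';
        $titles++;
    }
    return (\@fund, \@reject);
}

# Usage with the five sample rows; the proxy prefix is a placeholder.
my @sample = (
    '101|Journal of Nature|PBIO|',
    '101|Journal of Nature|http://www.nature.org/|',
    '102|Journal of Crap|PBIO|',
    '102|Journal of Crap|http://www.crap.org/|',
    '102|Journal of Crap|http://www.jstor.org/crap/|',
);
my ($fund, $reject) = sort_records(\@sample, { PBIO => 12 },
                                   'http://ezproxy.example/login?url=', 10000000);
print "$_\n" for @$fund;
print "rejected: $_\n" for @$reject;
```

With these five rows the sketch should emit three fund rows, with cat_keys 10000101, 10000102 and 20000102, and no rejects. Because nothing lives outside the sub, 'strict' will flag any variable that is used without being declared.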



HTH,

Charles K. Clarkson
PACT|11|
PART|6|
PBIO|12|
PCEM|13
PCHE|14|
PCIE|15|
PCOM|16|
PECO|17|
PEDN|8|
PEES|7|
PELE|18|
PENG|19|
PFIN|20|
PHIS|22|
PINE|21|
PINR|23|
PJRN|24|
PLAW|25|
PLIR|1|
PMAR|26|
PMAT|3|
PMEH|27|
PMLL|28|
PMSE|9|
PMUS|4|
PPHL|2|
PPHY|29|
PPOL|10|
PPSY|30|
PREL|31|
PSOC|32|
PTHE|33|
PUND|1|
PSTF|100|
#!/s/sirsi/Unicorn/Bin/perl

#############################################################
# etexts.pl
# Tim McGeary

use warnings;
use Time::Local;

my $fund_file = "codes.txt";
my $sirsi_file = "sample.txt";

my $fundcoded_file = "ej.title";    # only data that had a valid fund code
my $etexts = "etexts.data";         # data that will be imported into etext table
my $disEtexts = "disEtexts.data";   # mapping file of etexts to disciplines
my $cat_value = 10000000;
my $reject_file = "reject.title";

###### DATE ####
# getting date

my ($sec,$min,$hour,$mday,$mon,$year,
$wday,$yday,$isdst) = localtime time;

$mon += 1;
$year += 1900;
my $date = "$year-$mon-$mday";
###### END DATE ####

###### BEGIN read in fund.codes ######
# opens and reads in the fund code file #
open (FUND,"< $fund_file") or die "Cannot open $fund_file";
print "Opening $fund_file \n";

my %codes_hash;
while (<FUND>) {
   chomp;
   my ($code, $id) = split /\|/;
   $codes_hash{$code} = $id;
}

print "Closing $fund_file \n";
close (FUND);
###### END read in fund.codes ######

###### BEGIN read in Sirsi data ######
# reads in Sirsi data that is created by prtentry.  The format is the
# following:
# |cat_key|Title|Fund_Code or URL|
#
# Some records will have more than one URL so it will either be a
# two or three line record
#

# Therefore the cat_key will have to be compared line by line to see
# where the next record starts
#
# Also, an algorithm to keep unique DB_IDs will be 10000000+cat_key
#

open (SIRSI,$sirsi_file) or die "Cannot open $sirsi_file";
print "Opening $sirsi_file to push into array \n";

my @sirsi_array;

# push each line of this file to an array

while (<SIRSI>) {
   chomp;
   push(@sirsi_array, $_);
}
print "Closing $sirsi_file \n";
close (SIRSI);
###### END read in Sirsi data ######

###### BEGIN Separating Sirsi data ######
# pull out unique data from the prtentry data we have

my $preCat = 0;
my $titles = 0;
my $title = '';

my $code = '';
my $url = '';
my @fund_array;
my @reject_array;

my $ezproxy = "http://ezproxy";  # snipped for security purposes

print "Sorting sirsi data into fundcoded information... \n";

foreach $item (@sirsi_array) {
   my ($temp_key, $temp_title, $data) = split (/\|/, $item);
   if ($preCat != $temp_key) {   # new record
      $titles = 0;
      for (keys %codes_hash) {
         if ($data =~ /$_/) {
            $found = 1;
            last;
         }
      }
      if ($found) {
         $titles = 1;
         $code = $data;
      }
      else {
         $titles = 0;
         push(@reject_array,"$temp_key\|$temp_title\|");
         next;
      }
      $preCat = $temp_key;
      $found = 0;
   }
   else {
      $title = $temp_title;
      if ($data =~ /jstor/) {
         $title = "$title(archive)";
      }
      elsif ($data =~ /prola/) {
         $title = "$title(archive)";
      }
      $url = "$ezproxy$data";
      $cat_key = $titles * $cat_value + $temp_key;
      push(@fund_array,"$cat_key\|$title\|$url\|$code\|");
      $titles = $titles + 1;
   }
}
##### END Separating Sirsi data #####

##### BEGIN saving only fund coded data ######
# save only data that has a fund code #
open (FILE,"> $fundcoded_file") or die "Cannot open $fundcoded_file";
print "Opening $fundcoded_file\n";
print "Saving data to $fundcoded_file\n";
foreach $coded (@fund_array) {
   print FILE "$coded\n";   # print, not printf: the data may contain '%'
}
print "Closing $fundcoded_file\n\n";
close (FILE);

# save reject sirsi data - those w/o fund codes
open (REJECT,"> $reject_file") or die "Cannot open $reject_file";
print "Opening $reject_file\n";
print "Saving data to $reject_file\n";
foreach $reject (@reject_array) {
   print REJECT "$reject\n";   # print, not printf: the data may contain '%'
}
print "Closing $reject_file\n\n";
close (REJECT);

###### END saving only fund coded data ######

###### Save tab delimited files for initial db load #######
open (ETEXTS, "> $etexts") or die "Cannot open $etexts";
open (DISETEXTS, "> $disEtexts") or die "Cannot open $disEtexts";
print "Organizing etexts and disEtexts data \n";

my $lcd = '0';
my $id = '';
foreach $item (@fund_array) {
   my ($cat_key, $title, $url, $code) = split (/\|/, $item);
   my $descript = $url;
   print ETEXTS "$cat_key\t$title\t$url\t$descript\t$code\t$date\t$lcd\n";
   for (keys %codes_hash) {
      if ($code eq $_) {
         $id = $codes_hash{$_};
         last;
      }
   }
   printf (DISETEXTS "$id\t$cat_key\n");
}
print "Closing $etexts and $disEtexts\n\n";
close (ETEXTS);
close (DISETEXTS);

671|Psychophysiology|PPSY|
671|Psychophysiology|http://www.blackwell-synergy.com/servlet/useragent?func=showIssues&code=psyp|
675|Notes and queries|PENG|
675|Notes and queries|http://www.bodley.ox.ac.uk/ilej/journals|
675|Notes and queries|http://www.ingenta.com/journals/browse/oup/notesj|
676|The Journal of general psychology|http://asa.lib.lehigh.edu/cgi-bin/pubid?Pub=14345|
681|Greece & Rome|PHIS|
681|Greece & Rome|http://www.jstor.org/journals/00173835.html|
681|Greece & Rome|http://www3.oup.co.uk/gromej/contents.html|
709|Ecology|PEES|
709|Ecology|http://www.jstor.org/journals/00129658.html|
709|Ecology|http://asa.lib.lehigh.edu/cgi-bin/pubid?Pub=23845|
813|Western folklore|http://asa.lib.lehigh.edu/cgi-bin/pubid?Pub=29025|
843|African affairs|PINR|
843|African affairs|http://www.jstor.org/journals/00019909.html|
843|African affairs|http://www3.oup.co.uk/afrafj|
845|American artist|PART|
845|American artist|http://asa.lib.lehigh.edu/cgi-bin/pubid?Pub=28020|
847|The American midland naturalist|PEES|
847|The American midland naturalist|http://asa.lib.lehigh.edu/cgi-bin/pubid?Pub=28452|
909|Journal of the American Academy of Religion|PREL|
909|Journal of the American Academy of Religion|http://asa.lib.lehigh.edu/cgi-bin/pubid?Pub=27977|
915|Slavic review|PINR|
915|Slavic review|http://www.jstor.org/journals/00376779.html|
930|The American Slavic and East European review|http://www.jstor.org/journals/10497544.html|
953|The China business review|PMAR|
953|The China business review|http://asa.lib.lehigh.edu/cgi-bin/pubid?Pub=17129|
965|The Atlantic monthly|http://moa.cit.cornell.edu/moa/browse.journals/atla.html|
1098|American journal of science|PEES|
1098|American journal of science|http://www.geology.yale.edu/~ajs/Regular.html|
1199|Biometrics|PMAT|
1199|Biometrics|http://www.jstor.org/journals/0006341X.html|
1199|Biometrics|http://www.blackwell-synergy.com/servlet/useragent?func=showIssues&code=biom|
1210|Biometrics bulletin|http://www.jstor.org/journals/00994987.html|
1343|Consumers' research magazine|PMAR|
1343|Consumers' research magazine|http://asa.lib.lehigh.edu/cgi-bin/pubid?Pub=28909|
1646|Industrial and engineering chemistry|http://pubs3.acs.org/acs/journals/toc.page?incoden=iecac0|
