RE: One last question

2002-09-18 Thread Kipp, James

> 
> I have a text file with a list of about 56,000 filenames. 
> Only the filenames are in this file.  I have another 30,000 
> or so .cfm and .htm files. I want to use File::Find to cycle 
> through EVERY file in EVERY directory line by line (about 2 
> million lines in all). Every time it comes across a reference 
> to one of the 56,000 files I have in the list in the htm or 
> cfm file it needs to replace it with a lowercase version of 
> it. Not touching ANYTHING else.

I'm not exactly sure what you want to do here. File::Find is the right way to
recurse through directories. You may also want to take a look at grep
(perldoc -f grep). Also check the File::Find docs more closely; I don't see
where you actually provide a directory to File::Find.

I usually feed dirs to it something like this:

find(\&wanted, @paths);

sub wanted
{
    # do stuff with $_ (the current file name)
}





> 
> CODE BELOW:
> #!/usr/bin/perl -w
> use strict;
> use File::Find;
> 
> sub process_files{
>  open($FH, "< $_") or die("Error! Couldn't open $_ for 
> reading!! Program aborting.\n");
>  open($MATCH, "< /home/losttre/match.txt") or die("Error! 
> Couldn't open $MATCH for reading!\n");
>  open($TEMP, "./temp.dat") or die ("Couldn't open temp file! 
> Aborting\n");
>  
>  @MATCH = <$MATCH>;
>  @fcontents = <$FH>;
>  
>  foreach $lineitem (@MATCH){
>   foreach $lineitem2 (@fcontents){
>    if($lineitem == i/$lineitem2/){
> #I ASSUME THIS IS WHERE MY MATCH WOULD HAPPEN AND I NEED TO REPLACE THE STRING
> }
> }
 


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




Re: One last question

2002-09-17 Thread david

Anthony Saffer wrote:

> Hello AGAIN,
> 
> I have one final question that I think will set me free from this coding
> haze I've been in all day. Please look at the code below. Here is the idea
> I am trying to implement:
> 
> I have a text file with a list of about 56,000 filenames. Only the
> filenames are in this file.  I have another 30,000 or so .cfm and .htm
> files. I want to use File::Find to cycle through EVERY file in EVERY
> directory line by line (about 2 million lines in all). Every time it comes
> across a reference to one of the 56,000 files I have in the list in the
> htm or cfm file it needs to replace it with a lowercase version of it. Not
> touching ANYTHING else.
> 
> I know it's going to take regular expressions. This is where I am totally
> lost. Could someone give me some hints? Please don't provide me with ready
> made code as I won't really learn that way. But an idea on what I need to
> do would be very helpful.
> 
> Thanks!
> Anthony
> 
> CODE BELOW:
> #!/usr/bin/perl -w
> use strict;
> use File::Find;
> 
> sub process_files{
>  open($FH, "< $_") or die("Error! Couldn't open $_ for reading!! Program
>  aborting.\n"); open($MATCH, "< /home/losttre/match.txt") or die("Error!
>  Couldn't open $MATCH for reading!\n"); open($TEMP, "./temp.dat") or die
>  ("Couldn't open temp file! Aborting\n");
>  
>  @MATCH = <$MATCH>;
>  @fcontents = <$FH>;
>  
>  foreach $lineitem (@MATCH){
>   foreach $lineitem2 (@fcontents){
>    if($lineitem == i/$lineitem2/){
> #I ASSUME THIS IS WHERE MY MATCH WOULD HAPPEN AND I NEED TO
> #REPLACE THE STRING
> }
> }
> 
> NOTE: Yes, I am aware there are a lot of syntax and other problems with
> this code. I can probably correct those but I am totally lost on the
> matching.

Searching a large array is slow, because every lookup scans the whole list.
You should consider using a hash instead. Assume you have your 56,000
filenames in the 'master.txt' file and you want to search the '/searchable'
directory (and all its subdirectories) for a match:

#!/usr/bin/perl -w
use strict;
use File::Find;

my %master;

#-- first load master.txt into a hash:
#--
open(MASTER,'master.txt') || die $!;
while(<MASTER>){
    chomp;
    $master{$_} = 1;
}
close(MASTER);

#-- now traverse the '/searchable' directory for a match
#--
find(\&process,'/searchable');

sub process{

    #-- assume the filenames in master.txt are only relative
    #-- if that's not the case, $File::Find::dir can help prefix $_
    #-- (return, not next: process() is a sub, not a loop body)
    return unless(exists $master{$_});

    #-- found a match:
    #--
    #-- $_ is the matching filename
    #-- $File::Find::dir is the directory $_ resides in
    #-- $File::Find::name is the full path
    #--
    #-- do whatever you want to do, such as the rename()
    #-- you plan
}

__END__

This way, your script will spend most of its time traversing the 
directories instead of finding matches within the directories.

david
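
The hash lookup above matches file names on disk. The same idea can also be
applied to references inside file contents, which is what the original
question asks about. What follows is only a minimal sketch under stated
assumptions: the hash is keyed by the lowercased name so the lookup is
case-insensitive, a "reference" is any run of word characters, dots and
hyphens, and the sample list and the name lowercase_refs are made up for
illustration (none of them come from the thread).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample list; the real one would be loaded from the list file.
# Keys are lowercased so that lookups ignore case.
my %master = map { lc($_) => 1 } qw(Index.HTM Logo.GIF About.cfm);

# Return a copy of $line in which every filename-like token that appears
# in %master (ignoring case) is replaced by its lowercase form.
# Tokens that are not in the list are left untouched.
sub lowercase_refs {
    my ($line) = @_;
    $line =~ s/([\w.-]+)/exists $master{lc($1)} ? lc($1) : $1/ge;
    return $line;
}

print lowercase_refs(qq{<a href="INDEX.htm">Home</a> <img src="Logo.GIF">\n});
```

Each line is scanned once and each token costs one hash lookup, so the work
grows with the size of the files rather than with files times list entries.
The token pattern is the part most likely to need adjusting, for example if
references include directory separators.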





One last question

2002-09-17 Thread Anthony Saffer

Hello AGAIN,

I have one final question that I think will set me free from this coding haze I've 
been in all day. Please look at the code below. Here is the idea I am trying to 
implement:

I have a text file with a list of about 56,000 filenames. Only the filenames are in 
this file.  I have another 30,000 or so .cfm and .htm files. I want to use File::Find 
to cycle through EVERY file in EVERY directory line by line (about 2 million lines in 
all). Every time it comes across a reference to one of the 56,000 files I have in the 
list in the htm or cfm file it needs to replace it with a lowercase version of it. Not 
touching ANYTHING else.

I know it's going to take regular expressions. This is where I am totally lost. Could 
someone give me some hints? Please don't provide me with ready-made code as I won't 
really learn that way. But an idea on what I need to do would be very helpful.

Thanks!
Anthony

CODE BELOW:
#!/usr/bin/perl -w
use strict;
use File::Find;

sub process_files{
 open($FH, "< $_") or die("Error! Couldn't open $_ for reading!! Program aborting.\n");
 open($MATCH, "< /home/losttre/match.txt") or die("Error! Couldn't open $MATCH for 
reading!\n");
 open($TEMP, "./temp.dat") or die ("Couldn't open temp file! Aborting\n");
 
 @MATCH = <$MATCH>;
 @fcontents = <$FH>;
 
 foreach $lineitem (@MATCH){
  foreach $lineitem2 (@fcontents){
   if($lineitem == i/$lineitem2/){
#I ASSUME THIS IS WHERE MY MATCH WOULD HAPPEN AND I NEED TO REPLACE THE STRING
}
}

NOTE: Yes, I am aware there are a lot of syntax and other problems with this code. I 
can probably correct those but I am totally lost on the matching.







One last question...

2001-06-21 Thread Jack Lauman

I get the following from 'grep CAD currency.csv' created from the
script below.  If the file has more than one e-mail message in it
(this one does) how can I get it to return the correct date along
with the currency rates data (which are correct)?

2001-06-14,14:16:23,PDT,CAD,Canada Dollars,0.657776,1.52027
2001-06-14,14:16:23,PDT,CAD,Canada Dollars,0.656214,1.52389
2001-06-14,14:16:23,PDT,CAD,Canada Dollars,0.656039,1.52430
2001-06-14,14:16:23,PDT,CAD,Canada Dollars,0.651900,1.53398
2001-06-14,14:16:23,PDT,CAD,Canada Dollars,0.652010,1.53372
2001-06-14,14:16:23,PDT,CAD,Canada Dollars,0.652225,1.53321

The above dates should fall between 6/14 and 6/20.

Thanks,

Jack

#!/usr/bin/perl
#
# cur2csv.pl
#

use strict;
use vars qw($started);
use vars qw($cur_sym $cur_desc $usd_unit $units_usd);
use vars qw($year $month $mday $hour $minute $second $timezone);
use vars qw($conv_date $date $time $tz);


use Date::Manip;
use String::Strip;

use DBI;
use DBD::Pg;

open (OUTFILE, ">", "currency.csv") || die "Can not open currency.csv for writing";

printf STDERR "Reading currency file...";
open (INFILE, "/var/spool/mail/currency") || die "Can not open /var/spool/mail/currency for reading";

while (<INFILE>) {

# Extract date and time of Currency Rate Quotation

($year, $month, $mday, $hour, $minute, $second, $timezone) =
/^Rates as of (\d+).(\d+).(\d+) (\d+):(\d+):(\d+) (\w+) (.*)$/; 

# Convert date from UTC (GMT) to PST8PDT and adjust date and time accordingly.

$tz = &Date_TimeZone;   
$conv_date = "$year-$month-$mday $hour:$minute:$second";
$conv_date = &ParseDate($conv_date);
$conv_date = &Date_ConvTZ($conv_date, $timezone, $tz);  
$date  = &UnixDate($conv_date,"%Y-%m-%d");
$time  = &UnixDate($conv_date,"%H:%M:%S");
$tz= &UnixDate($conv_date,"%Z");

$year and last;# If we've matched the data line, then bail out.

eof and print STDERR "Didn't find the date line";

}

# Extract the ISO 4217 Code for Currencies and Funds (1995)
# Extract the Currency Description, and trim the trailing spaces
# Extract US Dollars to Units rate, and trim the leading/trailing spaces
# Extract Units to US Dollars rate, and trim the leading/trailing spaces

while (<INFILE>) {

($cur_sym, $cur_desc, $usd_unit, $units_usd) =
/^([A-Z]{3})\s+([A-Za-z()\s]{28})\s+(\d+\.\d+)\s+(\d+\.\d+)/;

# Strip the trailing spaces from $cur_desc
StripTSpace($cur_desc);

$cur_sym and $started++;

if ($cur_sym) {
printf OUTFILE "%s\,%s\,%s\,%s\,%s\,%s\,%s\n",
$date, $time, $tz, $cur_sym, $cur_desc, $usd_unit, $units_usd;
}

}   

$started or print STDERR "Didn't find a currency line";

close(INFILE);
close(OUTFILE);
print STDERR "\n";

1;
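
One likely reason every row carries 2001-06-14: the first loop stops at the
first "Rates as of" line, and the second loop then reads the rest of the
mailbox without looking for another date line. A minimal sketch of a
single-pass alternative follows. It assumes date lines look like
"Rates as of 2001.06.14 21:16:23 UTC (GMT)" (the exact layout is a guess
based on the regex in the script), it leaves out the Date::Manip time zone
conversion, and the name rates_to_csv and the sample lines are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn mailbox lines into CSV rows, tagging each rate line with the most
# recent "Rates as of" line seen above it.
sub rates_to_csv {
    my @lines = @_;
    my ($date, $time, $zone);
    my @rows;
    for (@lines) {
        # A new date line starts a new message: remember its date and time.
        # The time zone conversion from the original script would go here.
        if (/^Rates as of (\d+).(\d+).(\d+) (\d+):(\d+):(\d+) (\w+)/) {
            ($date, $time, $zone) = ("$1-$2-$3", "$4:$5:$6", $7);
            next;
        }
        next unless defined $date;    # ignore anything before the first date line
        # Currency line: code, description (trailing spaces excluded), two rates.
        if (/^([A-Z]{3})\s+([A-Za-z()\s]+?)\s+(\d+\.\d+)\s+(\d+\.\d+)/) {
            push @rows, join(',', $date, $time, $zone, $1, $2, $3, $4);
        }
    }
    return @rows;
}

# Made-up sample input with two messages in one mailbox.
my @sample = (
    "Rates as of 2001.06.14 21:16:23 UTC (GMT)",
    "CAD Canada Dollars      0.657776   1.52027",
    "Rates as of 2001.06.15 21:16:23 UTC (GMT)",
    "CAD Canada Dollars      0.656214   1.52389",
);
print "$_\n" for rates_to_csv(@sample);
```

Because the date variables are updated inside the same loop that reads the
rate lines, each row picks up the date of the message it belongs to.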