Andy Schwarz wrote:
>
> Howdy!
Hello,
> I am attempting to import some mailing list archives into lyris format using
> a Perl script. I have all of the script working, except for the importing of
> the dates. For some reason, the date field does not import correctly.
>
> Below is the data that I am trying to import and the script used.
>
> Any help or advice would be GREATLY appreciated!!
>
> Thank you!
> Andy Schwarz
>
> The e-mail headers look like:
>
> Date: Mon, 31 Jan 2000 19:09:12 -0600
> Reply-To: Bob Jones <[EMAIL PROTECTED]>
> Sender: ISWORLD Information Systems World
> Network<[EMAIL PROTECTED]>
> From: Bob Jones <[EMAIL PROTECTED]>
> Subject: AMCIS 2000 Minitrack
>
> The Perl script:
use warnings;
use strict;
> # find the date line
> $DatePos = index($ThisMessage, "Date: ");
If the "Date: " string is at the beginning of $ThisMessage, which it
appears to be in your example, then the value of $DatePos will be set to
0 (zero).
> $EndDatePos = index($ThisMessage, "\n", $DatePos);
If $DatePos is zero then this will be the same as:
$EndDatePos = index($ThisMessage, "\n");
And if $ThisMessage holds only that one line, the newline is its last
character, so this is the same as:
$EndDatePos = length($ThisMessage) - 1;
Or you could just use chomp() to remove the newline and use length().
> # extract the date line from the header
> if ($DatePos > 0) {
> $Date = &Trim(substr($ThisMessage, $DatePos + 6, $EndDatePos - $DatePos - 6));
> $Date = lc($Date);
If $DatePos is zero, as I pointed out earlier, this block will not run.
index() returns -1 when the string is not found, so the test should be
$DatePos >= 0.
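A minimal sketch of how index() behaves here, assuming $ThisMessage holds the header text from your example (the variable names follow your script; the ">= 0" test and the "X-Nope: " lookup are only for illustration):

```perl
use strict;
use warnings;

# Sample header taken from the example above; assumed to start with "Date: "
my $ThisMessage = "Date: Mon, 31 Jan 2000 19:09:12 -0600\n"
                . "Subject: AMCIS 2000 Minitrack\n";

my $DatePos    = index($ThisMessage, "Date: ");        # 0: found at the start
my $EndDatePos = index($ThisMessage, "\n", $DatePos);  # first newline after it
my $Missing    = index($ThisMessage, "X-Nope: ");      # -1: not found

my $Date;
# Compare against -1 (not found), not 0, so a match at offset 0 still counts
if ($DatePos >= 0) {
    $Date = substr($ThisMessage, $DatePos + 6, $EndDatePos - $DatePos - 6);
}
print "$Date\n";
```

With the "> 0" test from the original script, $Date would never be set for this input.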
> # parse mail date format
> $Date =~ /([0-9]+[0-9]?)
> (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)((1?9?([8-9]+[0-9]+))|(2?0?
> ([0-9]+[0-9]+)))/;
> $Day = $1;
> $Month = $months{$2};
> $Year = $3;
You should _always_ test that the regular expression matched before using
the capture variables ($1, $2, $3).
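A small sketch of that check, under some assumptions: the pattern is a simplified one and not your original, and parse_dmy is a hypothetical helper name. The captures are only used inside the branch where the match succeeded, because after a failed match $1, $2 and $3 still hold whatever an earlier successful match left in them. Note the /i as well: your script lowercases $Date first, so a case-sensitive "Jan" would not match "jan".

```perl
use strict;
use warnings;

my %months = (jan => 1, feb => 2, mar => 3, apr => 4,  may => 5,  jun => 6,
              jul => 7, aug => 8, sep => 9, oct => 10, nov => 11, dec => 12);

# Hypothetical helper: returns (day, month, year) or an empty list on failure
sub parse_dmy {
    my ($text) = @_;
    # Use the captures only inside the branch where the match succeeded
    if ($text =~ /(\d{1,2})\s+([a-z]{3})\s+(\d{4})/i) {
        my ($day, $mon, $year) = ($1, $months{ lc $2 }, $3);
        return ($day, $mon, $year) if defined $mon;
    }
    return;
}

my @ok  = parse_dmy("mon, 31 jan 2000 19:09:12 -0600");
my @bad = parse_dmy("no date here");
print @ok  ? "parsed: @ok\n"  : "Unable to parse date\n";
print @bad ? "parsed: @bad\n" : "Unable to parse date\n";
```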
> $NewDate = $Month.'/'.$Day.'/'.$Year;
> $StdDate = &UnformatDate($NewDate);
Your sub UnformatDate can be replaced with a single sprintf():
my $StdDate = sprintf '%04d%02d%02d', $Year, $Month, $Day;
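For illustration, a short sketch of how that format pads each field, using the date from the sample header. The values are filled in by hand here rather than parsed.

```perl
use strict;
use warnings;

# Values from the sample header (31 Jan 2000), hard-coded for illustration
my ($Year, $Month, $Day) = (2000, 1, 31);

# %04d and %02d zero-pad to 4 and 2 digits, giving YYYYMMDD
my $StdDate = sprintf '%04d%02d%02d', $Year, $Month, $Day;
print "$StdDate\n";    # prints 20000131
```

sprintf only pads; it does not add a century, so a two-digit year such as 99 would come out as 0099 and needs expanding to four digits first.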
> #print "Day: $Day Month: $Month Year: $Year $NewDate\n";
> if ($Month < 1) {
> print "Unable to parse date: $Date\n";
> <STDIN>;
> }
> else {
> $ThisAttribs{'Created'} = $StdDate;
> }
>
> sub UnformatDate {
> my $InDate = $_[0];
> if ($InDate =~ /(.*?)\/(.*?)\/(.*)/) {
> my $tmpYear = "0019".$3;
Why "0019"? Are you assuming a twentieth century date?
> my $tmpMonth = "00".$1;
> my $tmpDay = "00".$2;
> my $ReturnDate = substr($tmpYear, length($tmpYear) -
> 4).substr($tmpMonth, length($tmpMonth) - 2).substr($tmpDay,
> length($tmpDay) - 2);
> return $ReturnDate;
> }
> return;
> }
If you have Date::Manip installed then:
use Date::Manip;
my $date;
if ( /^Date:\s+(.+)/ ) {
    $date = ParseDate( $1 );
}
# the first 8 characters of the parsed date are YYYYMMDD
my $StdDate = substr( $date, 0, 8 );
my ( $Year, $Month, $Day ) = unpack( 'a4a2a2', $date );
my $NewDate = "$Month/$Day/$Year";
__END__
If you don't have Date::Manip then:
my %months = qw(Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6
                Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12);
my $mon_re = join '|', keys %months;
my ( $Year, $Month, $Day );
if ( /^Date:\s+          # must start with Date:
      (?:Sun|Mon|Tue|Wed|Thu|Fri|Sat),\s+
                        # match weekday without capture
      (\d+)\s+          # match day of month
      ($mon_re)\s+      # match month
      (\d+)\s+          # match year
    /x ) {
    $Day   = $1;
    $Month = $months{ $2 };
    $Year  = $3;
}
my $NewDate = "$Month/$Day/$Year";
my $StdDate = sprintf '%04d%02d%02d', $Year, $Month, $Day;
__END__
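If the Date: line is not the first line of the text you are matching against, a variation along these lines may help. This is only a sketch under assumptions: header_to_stddate is a hypothetical helper name, the address in the sample header is a placeholder, and the weekday is treated as optional. The /m modifier lets ^ match at the start of any line in the header block.

```perl
use strict;
use warnings;

my %months = qw(Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6
                Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12);
my $mon_re = join '|', keys %months;

# Hypothetical helper: takes a whole header block, returns YYYYMMDD or undef
sub header_to_stddate {
    my ($header) = @_;
    # /m lets ^ match at the start of any line, not just the start of the string
    if ( $header =~ /^Date:\s+(?:\w{3},\s+)?(\d{1,2})\s+($mon_re)\s+(\d{4})\b/m ) {
        return sprintf '%04d%02d%02d', $3, $months{$2}, $1;
    }
    return;    # no parsable Date: line
}

# Placeholder address; the Date: line is deliberately not the first line
my $header = <<'END';
Reply-To: Bob Jones <bob@example.com>
Date: Mon, 31 Jan 2000 19:09:12 -0600
Subject: AMCIS 2000 Minitrack
END

my $StdDate = header_to_stddate($header);
print defined $StdDate ? "$StdDate\n" : "Unable to parse date\n";
```

Returning undef on failure keeps the "Unable to parse date" branch from your script meaningful without relying on $Month being less than 1.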
John
--
use Perl;
program
fulfillment