Andy Schwarz wrote:
>
> Howdy!

Hello,

> I am attempting to import some mailing list archives into lyris format using
> a Perl script. I have all of the script working, except for the importing of
> the dates. For some reason, the date field does not import correctly.
>
> Below is the data that I am trying to import and the script used.
>
> Any help or advice would be GREATLY appreciated!!
>
> Thank you!
> Andy Schwarz
>
> The e-mail headers look like:
>
> Date: Mon, 31 Jan 2000 19:09:12 -0600
> Reply-To: Bob Jones <[EMAIL PROTECTED]>
> Sender: ISWORLD Information Systems World
> Network<[EMAIL PROTECTED]>
> From: Bob Jones <[EMAIL PROTECTED]>
> Subject: AMCIS 2000 Minitrack
>
> The Perl script:

use warnings;
use strict;

> # find the date line
> $DatePos = index($ThisMessage, "Date: ");

If the "Date: " string is at the beginning of $ThisMessage, which it appears
to be in your example, then the value of $DatePos will be set to 0 (zero).

> $EndDatePos = index($ThisMessage, "\n", $DatePos);

If $DatePos is zero then this will be the same as:

    $EndDatePos = index($ThisMessage, "\n");

Assuming $ThisMessage holds just that one line, with the newline at the end,
this is the same as:

    $EndDatePos = length($ThisMessage) - 1;

Or you could just use chomp() to remove the newline and use length().

> # extract the date line from the header
> if ($DatePos > 0) {
>     $Date = &Trim(substr($ThisMessage, $DatePos + 6, $EndDatePos - $DatePos - 6));
>     $Date = lc($Date);

If $DatePos is zero, as I pointed out earlier, this block will not run.
index() returns -1 when the string is not found, so the test should be
$DatePos >= 0.

>     # parse mail date format
>     $Date =~ /([0-9]+[0-9]?)
> (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)((1?9?([8-9]+[0-9]+))|(2?0?
> ([0-9]+[0-9]+)))/;
>     $Day = $1;
>     $Month = $months{$2};
>     $Year = $3;

You should _always_ test that the regular expression matched before using the
numbered capture variables ($1, $2, $3). If the match fails they keep whatever
values the last successful match left in them.

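To illustrate both points (the index() test and guarding the capture
variables), here is a minimal sketch. The sample header is copied from your
example; the helper name parse_date() and the simplified regular expression
are made up for illustration and are not part of your script.

```perl
use strict;
use warnings;

# Sample header line, copied from the example in the original post.
my $line = "Date: Mon, 31 Jan 2000 19:09:12 -0600\n";

# index() returns 0 for a match at the very start of the string and -1 for
# no match, so "found" has to be tested with >= 0 rather than > 0.
my $date_pos = index( $line, 'Date: ' );
print "Date: found at offset $date_pos\n" if $date_pos >= 0;

my %months = (
    Jan => 1, Feb => 2, Mar => 3, Apr => 4,  May => 5,  Jun => 6,
    Jul => 7, Aug => 8, Sep => 9, Oct => 10, Nov => 11, Dec => 12,
);

# parse_date() is a hypothetical helper for this sketch. It returns
# ( year, month, day ) on success and an empty list on failure.
sub parse_date {
    my ($header) = @_;

    # $1, $2 and $3 are only used inside the branch where the match succeeded.
    if ( $header =~ /^Date:\s+(?:\w{3},\s+)?(\d{1,2})\s+([A-Z][a-z]{2})\s+(\d{4})\b/ ) {
        my ( $day, $mon, $year ) = ( $1, $2, $3 );
        return unless exists $months{$mon};
        return ( $year, $months{$mon}, $day );
    }
    return;
}

# A list assignment in boolean context is false when the right side is empty.
if ( my ( $year, $month, $day ) = parse_date($line) ) {
    printf "%04d%02d%02d\n", $year, $month, $day;
}
else {
    print "Unable to parse date: $line";
}
```

With the sample header this should print the offset 0 and then 20000131. A
line that does not match, such as the Subject: header, takes the else branch
instead of silently reusing stale values.
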
>     $NewDate = $Month.'/'.$Day.'/'.$Year;
>     $StdDate = &UnformatDate($NewDate);

Your sub UnformatDate can be replaced with an sprintf():

    my $StdDate = sprintf '%04d%02d%02d', $Year, $Month, $Day;

>     #print "Day: $Day Month: $Month Year: $Year $NewDate\n";
>     if ($Month < 1) {
>         print "Unable to parse date: $Date\n";
>         <STDIN>;
>     }
>     else {
>         $ThisAttribs{'Created'} = $StdDate;
>     }
>
> sub UnformatDate {
>     my $InDate = $_[0];
>     if ($InDate =~ /(.*?)\/(.*?)\/(.*)/) {
>         my $tmpYear = "0019".$3;
                         ^^^^
Why are you assuming a twentieth century date?

>         my $tmpMonth = "00".$1;
>         my $tmpDay = "00".$2;
>         my $ReturnDate = substr($tmpYear, length($tmpYear) -
> 4).substr($tmpMonth, length($tmpMonth) - 2).substr($tmpDay,
> length($tmpDay) - 2);
>         return $ReturnDate;
>     }
>     return;
> }

If you have Date::Manip installed then:

use Date::Manip;

my ( $StdDate, $NewDate );
if ( /^Date:\s+(.+)/ ) {
    # ParseDate() returns an empty string if it cannot parse the date
    my $date = ParseDate( $1 );
    if ( $date ) {
        # the first eight characters are YYYYMMDD
        $StdDate = substr( $date, 0, 8 );
        my ( $Year, $Month, $Day ) = unpack( 'a4a2a2', $date );
        $NewDate = "$Month/$Day/$Year";
    }
}

__END__

If you don't have Date::Manip then:

my %months = qw(Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6 Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12);
my $mon_re = join '|', keys %months;

my ( $Year, $Month, $Day, $NewDate, $StdDate );
if ( /^Date:\s+                            # must start with Date:
      (?:Sun|Mon|Tue|Wed|Thu|Fri|Sat),\s+  # match weekday without capture
      (\d+)\s+                             # match day of month
      ($mon_re)\s+                         # match month
      (\d+)\s+                             # match year
     /x ) {
    $Day   = $1;
    $Month = $months{ $2 };
    $Year  = $3;
    # only build the date strings when the match succeeded
    $NewDate = "$Month/$Day/$Year";
    $StdDate = sprintf '%04d%02d%02d', $Year, $Month, $Day;
}

__END__

John
--
use Perl;
program
fulfillment