Andy Schwarz wrote:
> 
> Howdy!

Hello,

> I am attempting to import some mailing list archives into lyris format using
> a Perl script. I have all of the script working, except for the importing of
> the dates.  For some reason, the date field does not import correctly.
> 
> Below is the data that I am trying to import and the script used.
> 
> Any help or advice would be GREATLY appreciated!!
> 
> Thank you!
> Andy Schwarz
> 
> The e-mail headers look like:
> 
> Date:         Mon, 31 Jan 2000 19:09:12 -0600
> Reply-To:     Bob Jones <[EMAIL PROTECTED]>
> Sender:       ISWORLD Information Systems World
> Network<[EMAIL PROTECTED]>
> From:         Bob Jones <[EMAIL PROTECTED]>
> Subject:      AMCIS 2000 Minitrack
> 
> The Perl script:

use warnings;
use strict;

> # find the date line
> $DatePos = index($ThisMessage, "Date: ");

If the "Date: " string is at the beginning of $ThisMessage, which it
appears to be in your example, then $DatePos will be set to 0 (zero).

> $EndDatePos = index($ThisMessage, "\n", $DatePos);

If $DatePos is zero then this will be the same as:

$EndDatePos = index($ThisMessage, "\n");

But if $ThisMessage holds only that one line, the newline is its last
character, so this is the same as:

$EndDatePos = length($ThisMessage) - 1;

Or you could just use chomp() to remove the newline and use length().

> # extract the date line from the header
> if ($DatePos > 0) {
>    $Date = &Trim(substr($ThisMessage, $DatePos + 6, $EndDatePos - $DatePos - 6));
>    $Date = lc($Date);

If $DatePos is zero, as I pointed out earlier, this block will not run.
index() returns -1 when the string is not found, so the test should be
$DatePos >= 0.
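
Here is a minimal sketch of that check, assuming the header is the first
line of $ThisMessage. The sample string is made up from your example, and
the s/// line is only a stand-in for your Trim() sub, which I have not seen:

```perl
use strict;
use warnings;

# Hypothetical sample: the Date: header is the very first line.
my $ThisMessage = "Date:         Mon, 31 Jan 2000 19:09:12 -0600\n"
                . "Subject:      AMCIS 2000 Minitrack\n";

my $DatePos = index($ThisMessage, "Date: ");   # 0 here: found at the start
my $Date;

# index() returns -1 when the substring is absent, so 0 is a valid hit.
if ($DatePos >= 0) {
    my $EndDatePos = index($ThisMessage, "\n", $DatePos);
    # Fall back to end of string if the line has no trailing newline.
    $EndDatePos = length($ThisMessage) if $EndDatePos < 0;
    $Date = substr($ThisMessage, $DatePos + 6, $EndDatePos - $DatePos - 6);
    $Date =~ s/^\s+|\s+$//g;   # stand-in for the original Trim()
}

print "DatePos=$DatePos Date=", (defined $Date ? $Date : '(undef)'), "\n";
```

With the original `> 0` test, $Date would stay undefined for this input.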


> # parse mail date format
> $Date =~ /([0-9]+[0-9]?)
> (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)((1?9?([8-9]+[0-9]+))|(2?0?
> ([0-9]+[0-9]+)))/;
> $Day = $1;
> $Month = $months{$2};
> $Year = $3;

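You should _always_ test that the regular expression matched before using
the capture variables ($1, $2, $3). If the match fails they keep whatever
values an earlier successful match left in them.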
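
Something along these lines, as a rough sketch only. The pattern is a
simplified one of my own, not your original, and the %months hash here is
an assumption since you did not show yours. Note the lowercase keys: you
run lc() on $Date first, so a pattern with "Jan" in it can never match:

```perl
use strict;
use warnings;

# Assumed lookup table; keys are lowercase because $Date has been lc()'d.
my %months = (jan => 1, feb => 2, mar => 3, apr => 4,  may => 5,  jun => 6,
              jul => 7, aug => 8, sep => 9, oct => 10, nov => 11, dec => 12);

my $Date = lc 'Mon, 31 Jan 2000 19:09:12 -0600';

my ($Day, $Month, $Year);

# Only touch $1, $2, $3 inside the branch where the match succeeded.
if ($Date =~ /(\d{1,2})\s+([a-z]{3})\s+(\d{4})/ and exists $months{$2}) {
    ($Day, $Month, $Year) = ($1, $months{$2}, $3);
    print "Day: $Day  Month: $Month  Year: $Year\n";
}
else {
    warn "Unable to parse date: $Date\n";
}
```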

> $NewDate = $Month.'/'.$Day.'/'.$Year;
> $StdDate = &UnformatDate($NewDate);

Your sub UnformatDate can be replaced with an sprintf():

my $StdDate = sprintf '%04d%02d%02d', $Year, $Month, $Day;
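
A quick worked example with the values from your sample header, assuming
the year has already been parsed as four digits:

```perl
use strict;
use warnings;

# Values taken from "Mon, 31 Jan 2000" in the sample header.
my ($Year, $Month, $Day) = (2000, 1, 31);

# %04d and %02d zero-pad the fields to fixed widths.
my $StdDate = sprintf '%04d%02d%02d', $Year, $Month, $Day;

print "$StdDate\n";   # prints 20000131
```

One difference to be aware of: with a two-digit year such as "00",
UnformatDate produces 1900 while the sprintf() produces 0000, so this only
works as a drop-in replacement if you capture the full four-digit year.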

> #print "Day: $Day  Month: $Month  Year: $Year  $NewDate\n";
> if ($Month < 1) {
>     print "Unable to parse date: $Date\n";
>     <STDIN>;
> }
> else {
>     $ThisAttribs{'Created'} = $StdDate;
> }
> 
> sub UnformatDate {
>     my $InDate = $_[0];
>     if ($InDate =~ /(.*?)\/(.*?)\/(.*)/) {
>         my $tmpYear = "0019".$3;
                         ^^^^
Why are you assuming a twentieth century date?

>         my $tmpMonth = "00".$1;
>         my $tmpDay = "00".$2;
>         my $ReturnDate = substr($tmpYear, length($tmpYear) -
> 4).substr($tmpMonth, length($tmpMonth) - 2).substr($tmpDay,
> length($tmpDay) - 2);
>         return $ReturnDate;
>     }
>     return;
> }


If you have Date::Manip installed then:

use Date::Manip;

my $date;
if ( /^Date:\s+(.+)/ ) {
    $date = ParseDate( $1 );
    }

my $StdDate = substr( $date, 0, 8 );   # YYYYMMDD
my ( $Year, $Month, $Day ) = unpack( 'a4a2a2', $date );
my $NewDate = "$Month/$Day/$Year";

__END__

If you don't have Date::Manip then:

my %months = qw(Jan 1 Feb 2 Mar 3 Apr  4 May  5 Jun  6
                Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12);
my $mon_re = join '|', keys %months;
my ( $Year, $Month, $Day );
if (   /^Date:\s+      # must start with Date:
        (?:Sun|Mon|Tue|Wed|Thu|Fri|Sat),\s+
                       # match weekday without capture
        (\d+)\s+       # match day of month
        ($mon_re)\s+   # match month
        (\d+)\s+       # match year
       /x ) {

    $Day   = $1;
    $Month = $months{ $2 };
    $Year  = $3;
    }

my $NewDate = "$Month/$Day/$Year";
my $StdDate = sprintf '%04d%02d%02d', $Year, $Month, $Day;

__END__


John
-- 
use Perl;
program
fulfillment
