Will of Thornhenge wrote:

> I'm doing some work with mail headers that involves converting 
> timestamps to a standard format. The following regex works except for 
> one pesky trailing close parens.
> 
> Here's a sample of the data that causes problems:
> 
> ==sample data
> Date: Fri, 1 Aug 1997 08:10:16 -0700 (PDT)<br>
> ===
> 
> This is converted to a YYYYMMDD.hhmmss format in place, then the result 
> is fed to this regex:
> 
> ==code extract
> # handle YYYYMMDD.hhmmss +0530 (IST) and similar
> while (/\b
>          (                   # $1 to $old
>           (\d{8}\.\d{6})     # $2 to datestamp
>           \s+
>           ([-+]?\d\d\d\d)    # $3 to $timezone
>           ( \s+ [(]?         # $4 if there is an abbrev,
>            [A-Z]{2,5}        # like EST or (EST)
>            [)]? )?           # then just get rid of it
>          )
>         \b/x ) {

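Drop the closing \b. It can't match between ')' and '<' because both are
non-word characters, so the regex backtracks and leaves the ')' out.
(I also changed the optional $4 so the abbreviation must be parenthesized.)

while (/\b
         (                      # $1 to $old
         (\d{8}\.\d{6})         # $2 to datestamp
         \s+
         ([-+]?\d{4})           # $3 to $timezone
         (\s+\([A-Z]{2,5}\))?   # $4 if a parenthesized abbrev, like (EST)
         )                      # then just get rid of it
        /x) {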

>     my ($old, $d1, $z1, ) = ($1, $2, $3, );
>     if (exists $timeZones{$z1}) {
>        my $z2 = $timeZones{$z1};  # obtain the abbreviation
>        $z1 = $timeZones{$z2};     # then the numeric value for the abbrev
>        my $d2 = date2Epoch($d1) + 3600 * ($tz - $z1);
>        s/\Q$old\E/'_' . epoch2Date($d2) . ' ' . $tzabbrev/e;
>     }
>     else {
>        s/\Q$old\E/_$old/;   # just mark it unchanged
>     }
> }
> s/_(\d{8}\.\d{6})/$1/g;    # clean up markers
> return $_;
> ====
> 
> The output I'm getting is
> 
> ==converted sample
> Date: 19970801.071016 PST)<br>
> ====
> 
> The continued existence of that closing parens is the problem. It is not 
> being included in $1, which becomes $old. How can I force its inclusion 
> (and why is the regex not behaving greedily?)
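
On the greediness question: the regex is greedy, but a greedy quantifier still gives characters back when something after it can't match. The [)]? first grabs the ')', the closing \b then fails (')' followed by '<' is not a word boundary), and Perl backtracks to the point where [)]? matches nothing and \b succeeds between 'T' and ')'. Here is a minimal sketch that shows the difference. The $line value is my reconstruction of your sample after the in-place conversion, and $core, $with_b and $without_b are just names I made up for the demo:

```perl
use strict;
use warnings;

# Assumed input: the sample header after the date has been rewritten in place.
my $line = 'Date: 19970801.081016 -0700 (PDT)<br>';

# The body of the original pattern, without the outer group or the \b anchors.
my $core = qr/ (\d{8}\.\d{6}) \s+ ([-+]?\d{4}) ( \s+ [(]? [A-Z]{2,5} [)]? )? /x;

# With the closing \b the engine must backtrack off the ')'.
my ($with_b)    = $line =~ /\b($core)\b/x;

# Without it, the greedy [)]? keeps the ')'.
my ($without_b) = $line =~ /\b($core)/x;

print "with closing \\b:    [$with_b]\n";     # [19970801.081016 -0700 (PDT]
print "without closing \\b: [$without_b]\n";  # [19970801.081016 -0700 (PDT)]
```

If you still want to stop a bare abbreviation from running into following letters, a negative lookahead such as (?![A-Za-z]) after the optional group is one way to do it without tripping over the ')'.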



-- 
  ,-/-  __      _  _         $Bill Luebkert    Mailto:[EMAIL PROTECTED]
 (_/   /  )    // //       DBE Collectibles    Mailto:[EMAIL PROTECTED]
  / ) /--<  o // //      Castle of Medieval Myth & Magic http://www.todbe.com/
-/-' /___/_<_</_</_    http://dbecoll.tripod.com/ (Free site for Perl/Lakers)

_______________________________________________
Perl-Win32-Users mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
