Dynamic pattern matching?

2005-01-18 Thread Dan Fish

I've got a data file whose entries, for the most part, look like this (the
last 3 columns are data points):

LKG_535   P10X0.6 -2.00E-09   0.00E+00  amps -3.800E-13 -3.920E-12   -7.800E-13
VT_GM L0.8H40 -1.15E+00  -7.50E-01  volts-1.104E+00 -1.111E+00   -1.110E+00
IDSAT_5   Y0.8N20 -5.80E-03  -3.00E-03  amps -5.036E-06 -5.001E-06   -4.853E-06
VT_GP P0.8X.6 -1.15E+00  -7.50E-01  volts-1.018E+00 -9.966E-01   -1.012E+00
LOGU_II2.00.6  6.00E-03   1.00E-02  amps  8.992E-03 8.939E-03 8.903E-03

which I match with the following:

# RE for a valid floating point number
$fp = qr/[+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?/;

# Case for 3 data points
if ($line =~ /(.{9})\s+(.{10})\s+.{4}\s+($fp)\s+($fp)\s+(.{8})\s+($fp)\s+($fp)\s+($fp)\s+$/o)
{
  $datapts = 3;
  #Insert matched vars into Class::Struct array...
  ...
}

Once in a while, though, there might be a line that looks like this (this
case shows 3 extra data-point columns, but in reality there could be 1, 3
or 5 more):

HGYPG5M1_LG   OT   0.00E+00   2.00E-08  amps  1.000E-06 4.000E-11 2.000E-11 6.000E-11 4.000E-11 8.000E-11

I know I can write an if() clause to match every possible case, but I'm
wondering if there is a more general approach that would allow me to
dynamically match a varying number of extra columns within a single
expression.

Thanks,
-Dan   

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
 


RE: Dynamic pattern matching?

2005-01-18 Thread Moon, John
Here's a suggestion:

#! /usr/local/bin/perl
open PNTS,"points.dat" or die "Open failed";
@jumps=(\&p8,\&p9,\&p10,\&p11,\&p12);
while (<PNTS>) {
@points = split /\s+/;
print "entries =<", scalar(@points) - 8, ">\n";
&{$jumps[scalar(@points) - 8]}
if scalar(@points) > 7 && scalar(@points) < 13;
}
sub p8{print "8 values\n";}
sub p9{print "9 values\n";}
sub p10{print "10 values\n";}
sub p11{print "11 values\n";}
sub p12{print "12 values\n";}

I hope this gives you some ideas ...
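
If the handlers only differ in how many data points they receive, the same split-based idea can be sketched without one sub per count. This is only an illustration: it assumes every field is separated by whitespace and that each entry starts with five fixed columns (name, device, two limits, unit), which may not hold for the real file. The name data_points and the constant $FIXED_COLUMNS are made up for the example.

```perl
use strict;
use warnings;

# Assumption: name, device, low limit, high limit, unit.
my $FIXED_COLUMNS = 5;

# Hypothetical helper: returns the data points of one line as a list,
# or an empty list if the line has no data points at all.
sub data_points {
    my ($line) = @_;
    my @fields = split ' ', $line;    # split on runs of whitespace
    return if @fields <= $FIXED_COLUMNS;
    return @fields[ $FIXED_COLUMNS .. $#fields ];
}

my @samples = (
    'LKG_535   P10X0.6 -2.00E-09   0.00E+00  amps -3.800E-13 -3.920E-12   -7.800E-13',
    'HGYPG5M1_LG   OT   0.00E+00   2.00E-08  amps  1.000E-06 4.000E-11 2.000E-11 6.000E-11 4.000E-11 8.000E-11',
);

for my $line (@samples) {
    my @points = data_points($line);
    print scalar(@points), " data points: @points\n";
}
```

The count then comes from the array itself, so a line with 4, 6 or 8 data points needs no extra code.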

jwm

 




Re: Dynamic pattern matching?

2005-01-18 Thread Dave Gray
> I know I can write an if() clause to match every possible case, but I'm
> wondering if there is a more general approach that would allow me to
> dynamically match a varying number of extra columns within a single
> expression.

You could shove all the data points into one parenthetical group in
the first regex and then process them inside the if statement.
Something like:

my $line = 'HGYPG5M1_LG   OT   0.00E+00   2.00E-08  amps  1.000E-06
4.000E-11 2.000E-11 6.000E-11 4.000E-11 8.000E-11';
$line =~ s/[\r\n]/ /g; #line wrap

# ... is everything else that i'm too lazy to type ;)
if ($line =~ /^...(?:amps|volts)((?:\s+[.0-9eE+-]+)+)$/) {
  # last backref contains all data points
  my (..., $datapoints) = (..., $6);
  # split ' ' splits on runs of whitespace and skips the leading run
  my @datapoints = split ' ', $datapoints;
  # process based on (scalar @datapoints)
}
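
For reference, here is one way the idea above might look with the elided parts filled in. It is a sketch, not the exact regex from the original post: it assumes whitespace between every field, only the units amps and volts, and a simplified column layout (name, device, two limits, unit, then data points). The helper name parse_line and the hash keys are invented for the example, and $fp uses non-capturing groups so that it adds no backrefs of its own.

```perl
use strict;
use warnings;

# Floating point number; (?:...) groups keep the capture numbering simple.
my $fp = qr/[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?/;

# Hypothetical helper: returns a hash ref for a matching line, undef otherwise.
sub parse_line {
    my ($line) = @_;
    $line =~ s/[\r\n]+/ /g;    # rejoin an entry that was wrapped across lines
    return unless $line =~
        /^(\S+)\s+(\S+)\s+($fp)\s+($fp)\s+(amps|volts)((?:\s+$fp)+)\s*$/;
    my ($name, $device, $low, $high, $unit, $rest) = ($1, $2, $3, $4, $5, $6);
    # $6 holds every data point in one string; split it afterwards.
    my @points = split ' ', $rest;
    return {
        name   => $name,
        device => $device,
        low    => $low,
        high   => $high,
        unit   => $unit,
        points => \@points,
    };
}

my @samples = (
    "HGYPG5M1_LG   OT   0.00E+00   2.00E-08  amps  1.000E-06\n"
        . "4.000E-11 2.000E-11 6.000E-11 4.000E-11 8.000E-11",
    'LKG_535   P10X0.6 -2.00E-09   0.00E+00  amps -3.800E-13 -3.920E-12   -7.800E-13 ',
);

for my $line (@samples) {
    my $rec = parse_line($line) or next;
    printf "%s: %d data points\n", $rec->{name}, scalar @{ $rec->{points} };
}
```

A single expression handles any number of data points this way, and the count is available afterwards as scalar @{ $rec->{points} }.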

HTH,
Dave
