Regex Help Please!

2002-01-10 Thread Gordon Brandt

I am trying to come up with a script to convert this output from RRDTool DUMP to a 
format which lends itself to import into Excel 97.  Unfortunately, I am just getting 
started with Perl and do not have a clear enough grasp of how to configure this so 
that it strips out the unwanted parts and formats it correctly.  I would like to be 
able to feed a file into this script, and then receive a comma delimited formatted 
file as output.

Can anyone point me in the right direction?  I have the O'Reilly Camel book, but when 
I read the section on regexes, I feel like an idiot! :(

Input file:
|

(misc header information I want to delete)

#This is how the data I want to pull out is formatted
<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>
<!-- 2002-01-08 09:40:00 Eastern Standard Time / 1010500800 --> <row><v> 6.00e+001 </v><v> 6.90e+001 </v></row>

|---

Output wanted is:
2002-01-08 09:35:00 Eastern Standard Time, 1010500500, NaN, NaN
2002-01-08 09:40:00 Eastern Standard Time, 1010500800, 6.00e+001, 6.90e+001

|--

Thanks in advance.

Gordon



___
Perl-Win32-Users mailing list
[EMAIL PROTECTED]
http://listserv.ActiveState.com/mailman/listinfo/perl-win32-users



RE: Regex Help Please!

2002-01-10 Thread Wagner-David

Here is a simplistic approach. May want more edits, but is a starting place.

Placing the data for testing under DATA:

use strict;
use warnings;

while ( <DATA> ) {
    chomp;
    next if ( /^\s*$/ );   # bypass blank lines
    # captures: timestamp text, epoch seconds, first value, second value
    if ( /^<!--\s(\d+.+)\s\/\s(\d+)\s--> <row><v> (.+) <\/v><v> (.+) <\/v><\/row>/ ) {
        printf "%-s, %-s, %-s, %-s\n", $1, $2, $3, $4;
    } else {
        printf "No hit on data:\n%-s\n", $_;
    }
}
__DATA__
<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>
<!-- 2002-01-08 09:40:00 Eastern Standard Time / 1010500800 --> <row><v> 6.00e+001 </v><v> 6.90e+001 </v></row>
^--- Script ends here
Output:

2002-01-08 09:35:00 Eastern Standard Time, 1010500500, NaN, NaN
2002-01-08 09:40:00 Eastern Standard Time, 1010500800, 6.00e+001, 6.90e+001

Wags ;)




RE: Regex Help Please!

2002-01-10 Thread Ron Hartikka

That works, but not if a row has more or fewer than 2 values.
Does yours?
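
To illustrate the point about the value count, here is a minimal sketch, assuming each dump line carries the usual `<!-- ... -->` comment followed by `<row><v> ... </v></row>` markup. It collects however many values a row holds by matching with /g in list context. The sub name parse_row is made up for this example and is not from anyone's script in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (timestamp, epoch, values...) or an empty list.
sub parse_row {
    my ($line) = @_;
    # Leading comment: "<!-- timestamp text / epoch seconds -->"
    my ($stamp, $epoch) = $line =~ m{^\s*<!--\s*(.+?)\s*/\s*(\d+)\s*-->}
        or return;
    # In list context, m//g returns every captured <v> value on the line.
    my @values = $line =~ m{<v>\s*(.*?)\s*</v>}g;
    return ($stamp, $epoch, @values);
}

my @lines = (
    '<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>',
    '<!-- 2002-01-08 09:40:00 Eastern Standard Time / 1010500800 --> <row><v> 6.00e+001 </v></row>',
);
for my $line (@lines) {
    my @fields = parse_row($line) or next;   # skip lines that do not match
    print join(', ', @fields), "\n";
}
```

Because the values come back as a list, the same code copes with one, two, or more `<v>` groups without changing the pattern.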





RE: Regex Help Please!

2002-01-10 Thread Wagner-David

I worked from the data you provided. What can the data really look like?  
Provide some other samples and I will modify the script to handle them (hopefully).

Wags ;)




RE: Regex Help Please!

2002-01-10 Thread Brown, Aaron D

:: -Original Message-
:: From: Gordon Brandt [mailto:[EMAIL PROTECTED]]
:: Sent: Thursday, January 10, 2002 12:17 PM
:: To: [EMAIL PROTECTED]
:: Subject: Regex Help Please!
:: 
:: [-snip-]
:: 
:: Input file:
:: |
:: 
:: (misc header information I want to delete)
:: 
:: #This is how the data I want to pull out is formatted
:: <!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>
:: <!-- 2002-01-08 09:40:00 Eastern Standard Time / 1010500800 --> <row><v> 6.00e+001 </v><v> 6.90e+001 </v></row>
:: 
:: |---
:: 
:: Output wanted is:
:: 2002-01-08 09:35:00 Eastern Standard Time, 1010500500, NaN, NaN
:: 2002-01-08 09:40:00 Eastern Standard Time, 1010500800, 6.00e+001, 6.90e+001
:: 
:: |--


The people on this list like nothing more, it seems, than chewing on a
regular expression puzzle, so you've come to the right place.  However,
you'll get better results out of your request if you can fill in some more
information about the input source.  Regular expressions get more
complicated when they have to deal with more variable/generic data forms,
but they're relatively easy if you build one for a single specific case.
Programmers will usually go for the least complex solution that still
handles all possible scenarios.  So to give good regex help, we need to
understand how the input can vary.  Some issues that come to mind
regarding the input data in your problem are:

1) Is the data guaranteed to contain one record per line?  Can data ever
spread to 2 or more lines?
2) Inside the <v>...</v> tags are values (or not).  Are there ALWAYS EXACTLY
TWO <v></v> groups?
3) Can the data ever contain quotation marks or commas?  This is important
to know, when outputting to CSV.
4) Are there any other ways that the input data may vary from EXACTLY the
format you've presented in your sample?
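
On question 3, a minimal sketch of the usual CSV quoting convention: wrap a field in double quotes when it contains a comma, a quote, or a line break, and double any embedded quotes. The sub names csv_field and csv_line are invented for this example; for anything beyond a quick script, a CPAN module such as Text::CSV is likely the more robust choice.

```perl
use strict;
use warnings;

# Hypothetical helper: quote one field only if it needs it.
sub csv_field {
    my ($field) = @_;
    $field = '' unless defined $field;
    if ($field =~ /[",\r\n]/) {
        $field =~ s/"/""/g;        # double any embedded quotes
        $field = qq{"$field"};     # then wrap the whole field
    }
    return $field;
}

# Hypothetical helper: join quoted fields into one CSV record.
sub csv_line {
    return join(',', map { csv_field($_) } @_) . "\n";
}

print csv_line('2002-01-08 09:35:00 Eastern Standard Time', 1010500500, 'NaN', 'NaN');
print csv_line('a value, with comma', 'say "hi"');
```

If the dump values are only ever numbers or NaN, none of the fields would need quoting and a plain join would do; the helper only matters once free text can appear.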

 - Aaron

--
Aaron Brown  -  [EMAIL PROTECTED]
Middleware Programmer
University of Kansas
785-864-0423
http://www.ku.edu/~aaronb/
 



Re: Regex Help Please!

2002-01-10 Thread dolljunkie

A less elegant (perhaps) solution, but effective, no matter how many rows /
values:

use strict;
use warnings;

while (<>) {

    s/\r//g;   # I hate that carriage return
    chomp;

    next if (!/<!--/);   # skip non-matching lines

    my @values;

    # pull the "<!-- timestamp / epoch -->" comment off the line
    my $ts = '';
    $ts = $1 if (s/<!--\s*(.*?)\s*-->//);
    my ($ts1, $ts2) = split(/\s*\/\s*/, $ts);

    # consume each <row>...</row>, then each <v>...</v> inside it
    while (s/<row>(.*?)<\/row>(.*)/$2/) {
        my $row = $1;
        while ($row =~ s/<v>\s*(.*?)\s*<\/v>(.*)/$2/) {
            push(@values, $1);
        }
    }

    my $val_str = join(', ', @values);
    print("$ts1, $ts2, $val_str\n");
}

on input:
<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>
<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> 59 </v><v> 6000 </v><v>   700</v></row>

returns:
2002-01-08 09:35:00 Eastern Standard Time, 1010500500, NaN, NaN
2002-01-08 09:35:00 Eastern Standard Time, 1010500500, 59, 6000, 700

