Hi Rami

You've got it right about $1 and $2, and also about capturing matches to
variables. Your problem is that if your regex fails on any part of the match
then the whole match fails and none of the variables will be set. So in your
example most of your lines don't give a PID because the regex has failed to
match completely. I've made a couple of changes and it now seems to work for me.

/\s*(\d+)\s+ # OK
([0-9][0-9])?:? # was ([0-9]?[0-9]?): if you don't have the two digits, 
                # you won't have the trailing colon either, so that has
                # to be optional too.
                # To make it even shorter you could use (\d{2})?:?
                # \d is shorthand for [0-9] and {2} means two of.
([0-9][0-9]):[0-9][0-9]\s+ # OK
(\\_)?\s*\-? # was (\\?_?)\s+\-? if you're going to have either both of
             # \ and _ or neither then '?' is best off outside the brackets.
             # You need to specify 0 or more spaces (\s*) rather than 1
             # or more (\s+): the \s+ before this group has already used up
             # the space after the time, so on lines with no \_ and only one
             # space there, a second \s+ has nothing left to match.
(\w+)\s+ # OK
(\w)/x; # not sure exactly what you're trying to do here. At the moment it's
        # failing on the sh -c lines, because \w doesn't match the '-'. On
        # the other lines it's just getting the first letter of the word after
        # the command. Maybe you want (?:-(\w))? - this will get an optional
        # single letter which follows a - sign. Look up non-capturing
        # parentheses for an explanation of (?:...)
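
For reference, here is a minimal sketch of how the pieces above might fit
together in a complete script. It is an adaptation rather than the exact
regex above: it assumes the hours and their colon can be grouped as
(?:(\d\d):)?, and that the trailing switch, including the space before it,
should be optional so that lines with no arguments still match. The variable
names are only illustrative.

```perl
use strict;
use warnings;

# A few of the sample ps lines from the original message. The heredoc is
# single-quoted so the backslashes are kept literally.
my @lines = split /\n/, <<'END';
 3847       00:26 -tcsh
  884    03:29:36 perl test_meister.pl
 1034    01:36:33  \_ sh -c . setup_sh && cd /somedir && perl doit.pl
 1036    01:36:32      \_ perl doit.pl -w -a
 1076    01:36:31          \_ sh -c perl run.pl -w >> some_log 2>&1
END

my $ps_line = qr/
    ^\s*(\d+)\s+        # PID
    (?:(\d\d):)?        # optional hours together with their colon
    (\d\d):\d\d\s+      # minutes (captured) and seconds
    (\\_)?\s*-?         # optional \_ tree marker, optional leading dash
    (\w+)               # command name
    (?:\s+-(\w))?       # optional single-letter switch after the command
/x;

my @parsed;
for my $line (@lines) {
    # In list context a failed match returns an empty list, so the
    # assignment is false and none of the variables are set.
    if (my ($pid, $hours, $minutes, $marker, $command, $switch)
            = $line =~ $ps_line) {
        push @parsed, { pid => $pid, hours => $hours,
                        command => $command, switch => $switch };
        printf "pid = %s, command = %s, switch = %s\n",
            $pid, $command, defined $switch ? $switch : '(none)';
    }
    else {
        print "no match: $line\n";
    }
}
```

With this input all five lines match. On the first line $hours is left
undefined, because that line only has minutes and seconds.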

If you have problems like this, it's easiest if you break down the regex
over several lines (using the x flag). Then you can remove bits easily to
see where the failure is happening.
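
One way to do the same thing mechanically is sketched below. It cuts the
original regex into pieces and tries longer and longer prefixes against one
sample line until a prefix stops matching. The split points and variable
names are my own choice for illustration, not anything Perl requires.

```perl
use strict;
use warnings;

# One of the sample lines that gave no PID with the original regex.
my $line = '  884    03:29:36 perl test_meister.pl';

# The original pattern, cut into pieces in order. Using qr avoids
# having to double up the backslashes as you would in a plain string.
my @pieces = (
    qr/\s*(\d+)\s+/,
    qr/([0-9]?[0-9]?):/,
    qr/([0-9][0-9]):[0-9][0-9]\s+/,
    qr/(\\?_?)\s+/,
    qr/\-?(\w+)\s+/,
    qr/(\w)/,
);

my $first_failure;
for my $count (1 .. scalar @pieces) {
    # Glue the first $count pieces together and try them on their own.
    my $prefix = join '', @pieces[0 .. $count - 1];
    if ($line =~ /$prefix/) {
        printf "pieces 1..%d: match\n", $count;
    }
    else {
        printf "pieces 1..%d: FAIL\n", $count;
        $first_failure = $count;
        last;
    }
}
```

On this sample line the first failure is reported at piece 4, the
(\\?_?)\s+ part.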

Cheers

Mark C

> -----Original Message-----
> From: Rami Al-Kabra [mailto:[EMAIL PROTECTED]]
> Sent: 31 October 2001 16:33
> To: [EMAIL PROTECTED]
> Subject: Parsing lines from txt files.
> 
> 
> Hello,
> 
> I'm brand new to the world of Perl.  The topic I'm about to ask about
> might have been addressed before.  Sorry for the duplication, if that's
> the case.
> 
> My understanding is that whatever is between the parens in a regexp get
> stored in a variable ($1,$2,...).  And, if you do a
> 
> ($var_name1,$var_name2) = regexp with parens around interesting stuff to
> store,
> 
> then the interesting stuff inside the parens get stored in the vars you
> specify.  Am I understanding incorrectly?
> 
> The last print statement in the code below outputs nothing after the
> "pid =".  What am I doing wrong?
> 
> Thanks,
> Rami
> _______________________________
> $test_ps_txt_file = "txt_file_name";
> 
> print "Opening test ps file  $test_ps_txt_file...\n";
> open (TEST_PS_FILE, $test_ps_txt_file) || die "Couldn't open $test_ps_txt_file: $!";
> 
> while (<TEST_PS_FILE>)
> {
>   print "Read line...\n";
>   chomp; # used to remove newline char.
>   print "$_\n";
> 
>   #parse a line and store.
>   print "Parse line...\n";
>   
>   ($pid,$elapsed_hours,$elapsed_minutes,$char_combo,$command,$cmd_arg) =
>     /\s*(\d+)\s+([0-9]?[0-9]?):([0-9][0-9]):[0-9][0-9]\s+(\\?_?)\s+\-?(\w+)\s+(\w)/;
> 
>   print "pid = $pid\n";
> }
> ___________________________________
> 
> Here are the contents of the txt file (just create a file and
> copy/paste):
> 
>  3847       00:26 -tcsh
>   884    03:29:36 perl test_meister.pl
>  1034    01:36:33  \_ sh -c . setup_sh && cd /somedir && perl doit.pl
>  1036    01:36:32      \_ perl doit.pl -w -a
>  1076    01:36:31          \_ sh -c perl run.pl -w >> some_log 2>&1
>  1077    01:36:31              \_ perl run.pl -w
>  1497    01:34:48                  \_ sh -c some_exe some_file_name > redirect_file_name 2>&1
>  1498    01:34:48                      \_ some_exe dstest_control.out
> 
> 
