Brilliant - a first class "teach-in". Many thanks for the description - it all works. Brilliant.
-----Original Message-----
From: Rob Dixon [mailto:rob.di...@gmx.com]
Sent: 26 May 2012 2:13 PM
To: beginners@perl.org
Cc: Christopher Gray
Subject: Re: Help required to extract multiple text fields from a text string

On 25/05/2012 21:51, Christopher Gray wrote:
> Good day,
>
> I have a text file containing records. While I can extract single
> sub-strings, I cannot extract multiple sub-strings.
>
> The records are of multiple types - only about a third of which have
> the data I need.
>
> An example of a "good" record is
>
> Abc1234 STATUS open DESCRIPTION "A basket of melons" :: { fruittype 1}
>
> I'm trying to extract the first (Abc1234), second (open), third (A
> basket of melons) and fourth (1) strings.
>
> I can extract each of them separately - but not together.
>
> So - for example:
>
> while (<FILE>) {
>     chomp;
>     next if !/\{\s+fruittype\s+(\d+)\s+}/;
>     my $Temp =$1;
> }
>
> Extracts the fruittype. However, when I try and have multiple extracts:
>
> ...
> next if !/\STATUS\s+(\w+)\s+\{\s+fruittype\s+(\d+)\s+}/;
> ...
> It fails.
>
> What have I done wrong?

Hi Chris

Let's look at your regex

    /\STATUS\s+(\w+)\s+\{\s+fruittype\s+(\d+)\s+}/

First of all, you start with a \S, which will actually match any non-space character, not the S that you intended. But that alone wouldn't break your regex, because 'S' is itself a non-space character.

The regex as a whole is looking for

    'STATUS' (strictly, any non-space character followed by 'TATUS')
    some whitespace
    some 'word' characters (0..9, A..Z, a..z and _)
    some whitespace
    an open brace '{'
    some whitespace
    'fruittype'
    some whitespace
    some digits (0..9)
    some whitespace
    a closing brace (which should really be escaped but Perl forgives you)

This will match a string like

    '---XTATUS www { fruittype 999 }'

But since "STATUS open " is followed by "DESCRIPTION" in your record instead of an opening brace, the match fails.

To match multiple fields, you can write a regex that matches the entire string, with parentheses around the parts that must be captured.
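
The analysis above can be checked directly. The following is only a minimal sketch: the variable names ($re, @tests) are made up for illustration, and the two test strings are the ones quoted in this message.

```perl
use strict;
use warnings;

# The pattern from the question, unchanged. The leading \S matches any one
# non-space character, so the literal text that follows it is 'TATUS'.
my $re = qr/\STATUS\s+(\w+)\s+\{\s+fruittype\s+(\d+)\s+}/;

# Hypothetical test data: the odd string that matches, and the real record.
my @tests = (
    '---XTATUS www { fruittype 999 }',
    'Abc1234 STATUS open DESCRIPTION "A basket of melons" :: { fruittype 1}',
);

for my $line (@tests) {
    # In list context a successful match returns the captured groups;
    # a failed match returns an empty list.
    if (my @got = $line =~ $re) {
        print "match:    $line => @got\n";
    }
    else {
        print "no match: $line\n";
    }
}
```

The first string should be reported as a match with the captures "www" and "999", and the real record as no match, because "DESCRIPTION" sits where the pattern demands an opening brace.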
This will do what you want

    while (<DATA>) {
        my @data = /(\w+)\s+STATUS\s+(\w+)\s+DESCRIPTION\s+"([^"]+)"\s+::\s+\{\s*fruittype\s+(\d+)\s*\}/;
        print "$_\n" for @data;
    }

**output**

    Abc1234
    open
    A basket of melons
    1

but you may need to adjust the regex if the description can appear without quotes when it doesn't contain spaces.

HTH,

Rob

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
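
On the closing caveat about unquoted descriptions, one possible adjustment is sketched below. It assumes (this is not stated in the thread) that an unquoted DESCRIPTION is a single run of non-space characters. The names $record_re and parse_record, and the second sample record, are hypothetical.

```perl
use strict;
use warnings;

# /x lets the pattern be spread out with insignificant whitespace.
# The DESCRIPTION is either "quoted text" (capture 3) or one bare word (capture 4).
my $record_re = qr/
    (\w+) \s+ STATUS \s+ (\w+) \s+ DESCRIPTION \s+
    (?: "([^"]+)" | (\S+) ) \s+
    :: \s+ \{ \s* fruittype \s+ (\d+) \s* \}
/x;

# Returns (id, status, description, fruittype), or an empty list on no match.
sub parse_record {
    my ($line) = @_;
    my ($id, $status, $quoted, $bare, $type) = $line =~ $record_re
        or return;
    # Only one of the two description captures is defined for a given record.
    my $desc = defined $quoted ? $quoted : $bare;
    return ($id, $status, $desc, $type);
}

for my $line (
    'Abc1234 STATUS open DESCRIPTION "A basket of melons" :: { fruittype 1}',
    'Xyz9 STATUS closed DESCRIPTION melons :: { fruittype 2 }',   # made-up record
    'some other record type',
) {
    my @fields = parse_record($line) or next;   # skip records that do not match
    print join(' | ', @fields), "\n";
}
```

Using an alternation with two capture groups keeps the quotes out of the captured text. The cost is that the caller has to pick whichever of the two captures is defined.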