Hi Chris,
On Fri, May 25, 2012 at 11:53 PM, <[email protected]> wrote:
>
> May you try this matching?
>
> while(<DATA>) {
> next unless /^(\S+\s+)(\S+\s+)(\S+\s+).*\"(.*?)\"/;
> print "$1 $2 $3 $4\n";
> }
>
> HTH.
>
> ========================================
> Message Received: May 25 2012, 09:52 PM
> From: "Christopher Gray" <[email protected]>
> To: [email protected]
> Cc:
> Subject: Help required to extract multiple text fields from a text string
>
> Good day,
>
> I have a text file containing records. While I can extract single
> sub-strings, I cannot extract multiple sub-strings.
>
> The records are of multiple types - only about a third of which have the
> data I need.
>
> An example of a "good" record is
>
> Abc1234 STATUS open DESCRIPTION "A basket of melons" :: { fruittype 1}
>
> I'm trying to extract the first (Abc1234), second (open), third (A basket of
> melons) and fourth (1) strings.
>
> I can extract each of them separately - but not together.
>
> So - for example:
>
> while (<FILE>) {
> chomp;
> next if !/\{\s+fruittype\s+(\d+)\s+}/;
> my $Temp =$1;
> }
>
> Extracts the fruittype. However, when I try and have multiple extracts:
>
> ...
> next if !/\STATUS\s+(\w+)\s+\{\s+fruittype\s+(\d+)\s+}/;
> ...
> It fails.
>
> What have I done wrong?
>
> Chris
>
>
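You didn't show a record of the type that fails to match; from your code, I
assume a correct record is one that contains fruittype.
Your single-field regex matches, but the combined one cannot: after the
status word it expects \s+ and then { straight away, while your sample record
has DESCRIPTION "A basket of melons" :: in between. The pattern has to
account for that text, and capture the first field as well. Two smaller
points: \STATUS is read as \S followed by TATUS (it happens to match here,
but plain STATUS is what you want), and \s+} needs at least one whitespace
character before the closing brace, which your sample line does not show, so
\s*} is safer. For example: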
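while (<FILE>) {
    # Anchor on the STATUS and DESCRIPTION keywords so each capture is
    # exactly one field; .*? skips whatever sits before the brace.
    next unless
        /^(\S+)\s+STATUS\s+(\S+)\s+DESCRIPTION\s+"([^"]*)".*?\{\s*fruittype\s+(\d+)\s*\}/;
    print "$1 $2 $3 $4\n";    # prints: Abc1234 open A basket of melons 1
}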
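
If it helps, here is the same idea as a small self-contained sketch using the
/x modifier, so every piece of the pattern can carry a comment. It rests on a
few assumptions that go beyond your one sample line: every "good" record has
the literal keywords STATUS and DESCRIPTION, the first field and the status
word contain no whitespace, and the description has no embedded double
quotes. The name parse_record is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the four fields of a
# "good" record, or an empty list when the line is some other record type.
sub parse_record {
    my ($line) = @_;
    return $line =~ /
        ^ (\S+)                      # 1: record id, e.g. Abc1234
        \s+ STATUS \s+ (\S+)         # 2: status word, e.g. open
        \s+ DESCRIPTION \s+
        "([^"]*)"                    # 3: text between the double quotes
        .*?                          # skip anything before the brace
        \{ \s* fruittype \s+ (\d+)   # 4: the fruittype number
        \s* \}                       # whitespace before the brace is optional
    /x;
}

my $sample =
    'Abc1234 STATUS open DESCRIPTION "A basket of melons" :: { fruittype 1}';
my @fields = parse_record($sample);
print join(' ', @fields), "\n" if @fields;   # Abc1234 open A basket of melons 1
```

In list context a failed match returns an empty list, so the other record
types in your file simply fall through the "if @fields" test.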
-- tim