Lynn. Rickards wrote:
> Craig Cardimon wrote:
>
>>I need to read a large text file line by line until a certain tag is
>>found, say <TAG>. This tag will exist on a line by itself. Then I need
>>read in all subsequent lines, appending them to each other, until the
>>ending tag </TAG> is found, again on a line by itself.
>>
>>My logic, if you can call it that, looks like this:
>>
>>TAGAREA: while(<IN>)
>>{
>>
>> if(m/<TAG>/i .. m/<\/TAG>/i)
>> {
>> $tagarea = $tagarea . $_ ;
Concatenating works, but note that each pass through here sees only one
line in $_ (the tag lines included), not the entire tag content.
>> }
>>
>> next TAGAREA unless defined $tagarea;
>>}
This should work fine (note that I allowed for a tag that may have
additional attributes after the tag name and before the >):
while (<IN>) {
print "Content of TAG: $_" if /<TAG(?: |>)/i .. /<\/TAG>/i;
# if you want, you can append it to a hash entry or a variable,
# or do whatever else you need, instead of printing.
}
close IN;
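
If you want the section in a variable without the tag lines themselves, the
return value of the range operator can help. What follows is only a sketch,
not tested against your data: the sample text and the in-memory filehandle
$in are my own stand-ins for your file, and I am assuming each tag sits on
a line by itself, as you described.

```perl
use strict;
use warnings;

# Hypothetical sample input; in the real script this would be the open file.
my $sample = "header\n<TAG>\nline one\nline two\n</TAG>\ntrailer\n";
open(my $in, '<', \$sample) or die "Cannot open in-memory file: $!";

my $tagarea = '';
while (<$in>) {
    # In scalar context the range operator returns a sequence number:
    # false outside the range, 1 on the opening line, and a value
    # ending in "E0" on the closing line.
    my $seq = /<TAG(?: |>)/i .. /<\/TAG>/i;
    next unless $seq;          # outside the tag
    next if $seq == 1;         # skip the <TAG> line itself
    next if $seq =~ /E0$/;     # skip the </TAG> line itself
    $tagarea .= $_;            # accumulate one line per pass
}
close $in;

print $tagarea;
```

Because the loop sees one line per pass, appending to $tagarea is what builds
up the whole section.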
>>I also have a variety of other screen-out tests I perform on $tagarea.
>>If that variable contains any no-no's, I move on to the next TAGAREA.
>>
>>That's the theory, anyway. The logic isn't functioning. Where am I going
>>wrong?
>>
>>-- Craig
>>
>
>
> There's always more than one way...here's what I do
> to pull delimited sections out of a file...
> Incrementing and decrementing the $inTag semaphore
> should handle nested <TAG>s
>
> #### UNTESTED ####
> my $inTag = 0;
> my $tagNum = 0;
> my %tagContent;
> while (my $line = <IN>)   # one line at a time; foreach would slurp the whole file first
> {
> ($line =~/<TAG>/) && do {
> $inTag++;
> $tagNum++;
> next;
> };
> next unless $inTag;
> ($line =~/<\/TAG>/) && do {
> $inTag--;
> next;
> };
> # Here we are inside a <TAG>
> $tagContent{$tagNum} .= $line;
> }
> # Done reading file...semaphore should be back to 0
> # Worth checking -
> print "There was a problem - tag value: $inTag\n" if($inTag);
>
> Now you should have a hash of your tagged sections. I hope.
I don't think this does any more than the one-liner while loop.
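
If the file holds several sections and you want each one kept separately,
with your screen-out tests applied per section, something along these lines
might do. Again, just a sketch: the sample data, the @sections array, and
the /forbidden/ test are made up for illustration, and nested tags are not
handled.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the IN filehandle.
my $sample = "junk\n<TAG>\nkeep me\n</TAG>\njunk\n"
           . "<TAG>\nthis one is FORBIDDEN\n</TAG>\n";
open(my $in, '<', \$sample) or die "Cannot open in-memory file: $!";

my @sections;        # one element per accepted <TAG>...</TAG> section
my $current = '';    # lines of the section currently being read

while (my $line = <$in>) {
    my $seq = ($line =~ /<TAG(?: |>)/i) .. ($line =~ /<\/TAG>/i);
    next unless $seq;                  # outside any section
    if ($seq =~ /E0$/) {               # closing tag: section is complete
        # Hypothetical screen-out test; substitute the real checks here.
        push @sections, $current unless $current =~ /forbidden/i;
        $current = '';
        next;
    }
    $current .= $line unless $seq == 1;    # skip the opening tag line
}
close $in;

printf "Kept %d section(s)\n", scalar @sections;
```

Sections that fail the test are simply dropped, which is roughly the "move on
to the next TAGAREA" behaviour asked for.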
_______________________________________________
ActivePerl mailing list