On Fri, Feb 15, 2008 at 4:04 PM,  <[EMAIL PROTECTED]> wrote:
> I've been trying to use regular expressions or some kind of counter
>  thing, but I can't seem to work this right. I have data in a text file
>  where the important thing I want to extract is between two blank
>  lines. That's the only systematic way to find the useful lines.
>
>  So, it'll go "blank line, important stuff, blank line". There are
>  plenty of other lines of text that may have a blank line before or
>  after them. But only the important ones have both.
>
>  Is there a simple way to extract/print these lines? Thanks!
>

The problem, as I see it, is that once *some* things in a file are
surrounded by empty lines, then by definition *everything* in the file
is surrounded by blank lines, and you haven't really given us any way
to distinguish between the good stuff and the bad. Consider:

__DATA__
bad

good

bad

good

bad
bad
bad

good


There is no regex capable of telling the difference between the good
and the bad above, without some more information. Is the "good" data
always single-line? Is the "bad" data always multiline? So far
everyone seems to be assuming that, but is it the case?

My gut reaction, here, is to break the file up into records delimited
by '\n\n', and then perform some further test on the chunks to decide
whether they're good or bad, e.g.:

{
    local $/ = "\n\n";       # read blank-line-delimited records
    while (<>) {
        chomp;               # strip the trailing "\n\n"
        next unless length;  # skip empty records
        next if /\n/;        # multi-line record -- or change to suit
        # else we have a "good" record
    }
}
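If it helps, here's a self-contained version of that sketch you can run
as-is. The sample data and the "single-line means good" test are just
assumptions for the demo; swap in whatever filter actually fits your
records:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read records separated by blank lines by setting the input record
# separator to "\n\n".
local $/ = "\n\n";

while ( my $record = <DATA> ) {
    $record =~ s/\n+\z//;        # strip the separator and any trailing newline
    next unless length $record;  # skip empty records
    next if $record =~ /\n/;     # multi-line record: assumed "bad" here
    print "good: $record\n";
}

__DATA__
bad
bad

good

bad
bad
```

Running it prints only "good: good". Note that "\n\n" treats each blank
line as a separator; if your data can have runs of several blank lines,
paragraph mode ($/ = "") collapses those runs into a single separator,
which is often what you actually want.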


HTH,

-- jay
--------------------------------------------------
This email and attachment(s): [  ] blogable; [ x ] ask first; [  ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.downloadsquad.com  http://www.engatiki.org

values of β will give rise to dom!