--- Walnut <[EMAIL PROTECTED]> wrote:
> Suck the file into a single variable and:
> 
> $entirefile =~ s!\/\*.*?\*\/!!g;
>
> >I am also very new to Perl! I need to figure out how I could skip a
> >block of comments in a C header file. For example, if I have something
> >like the following:
> >
> >/* This is my block of comments.....blah
> >...blah.....................................................................and
> >lots more comments here........
> >and then even more here!.................... with my end of comments
> >identifier on the next line!.........
> >
> >*/

Well, better late than never:

The substitution "s!\/\*.*?\*\/!!g", while looking fine on the surface, has two 
problems.

1.  It doesn't account for false positives, such as comment-like text inside a string
literal, which shouldn't be removed:

    somevar = "/* I'm not a comment! */";

Admittedly, this is an unusual situation and not one that I lie awake at night
worrying about.  However, the second situation is far more common.

2.  Often, a programmer will comment out an entire chunk of code.  If that code has
multi-line comments of its own embedded in it, the above substitution will also fail.

    /*
    multi /* bla */
    line
    comment
    */

Try the following snippet:

    undef $/;
    $_ = <DATA>;
    s!\/\*.*?\*\/!!g;
    print;

    __DATA__
    somevar = "/* I'm not a comment! */";

    This 
    /*
    multi /* bla */
    line
    comment
    */
     is a sentence.

That will print:

    somevar = "";

    This
    /*
    multi
    line
    comment
    */
     is a sentence.
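
A side note on that output: the multi-line comment survives mainly because "." does not match a newline unless the /s modifier is given, so only the one-line "/* bla */" gets removed.  Here is a minimal sketch of the same substitution with /s added (strip_naive is just a name I made up for illustration, not something from the original post).  It does match across lines now, but it stops at the first "*/" and still empties the string:

```perl
use strict;
use warnings;

# strip_naive is a hypothetical helper name used only for this illustration.
sub strip_naive {
    my ($src) = @_;
    $src =~ s!/\*.*?\*/!!gs;    # /s lets "." match newlines as well
    return $src;
}

my $sample = <<'END';
somevar = "/* I'm not a comment! */";

This 
/*
multi /* bla */
line
comment
*/
 is a sentence.
END

print strip_naive($sample);

# Prints:
#
#   somevar = "";
#
#   This 
#
#   line
#   comment
#   */
#    is a sentence.
```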

Clearly, it's failing in both cases.  Unfortunately, some would say that the cure is
worse than the disease :)  Here's the regex (from Mastering Regular Expressions, with
a minor typo corrected):

###########################
s{
  # First, things we want to match and not throw away
   (
    [^"'/]+                               # other stuff
    |                                     #  - or -
    (?:"[^"\\]*(?:\\.[^"\\]*)*" [^"'/]*)+ # double-quoted string
    |                                     #  - or -
    (?:'[^'\\]*(?:\\.[^'\\]*)*' [^"'/]*)+ # single-quoted string
   )
   |                                      # or
   /(?:                                   # all comments start with a slash
     \*[^*]*\*+(?:[^/*]|[^*]*\*+)*/       # traditional C comments
     |                                    #  - or -
     /[^\n]*                              # C++ // -style comments
   )
}{$1}gsx;
###########################

Running that monstrosity in the above program results in:

    somevar = "/* I'm not a comment! */";

    This

     is a sentence.
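
For anyone who wants to try this without pasting the regex into the earlier snippet, here is a rough, self-contained sketch.  The sub name strip_comments and the test strings are my own inventions for illustration.  The replacement side uses /e with a defined() check purely to keep "use warnings" quiet when the comment branch matches; the pattern itself is meant to behave like the one above.  One behaviour worth checking before trusting it on real headers: the inner group is an alternation, and its "[^*]*" part is free to step over a "/".  As far as I can tell, that is what lets the nested example disappear in one go, but it also appears to join two separate comments and drop whatever sits between them:

```perl
use strict;
use warnings;

# strip_comments is a hypothetical wrapper name, used only for this sketch.
sub strip_comments {
    my ($src) = @_;
    $src =~ s{
       (                                        # things to keep, captured in $1
        [^"'/]+                                 # other stuff
        |
        (?:"[^"\\]*(?:\\.[^"\\]*)*" [^"'/]*)+   # double-quoted string
        |
        (?:'[^'\\]*(?:\\.[^'\\]*)*' [^"'/]*)+   # single-quoted string
       )
       |
       /(?:                                     # all comments start with a slash
         \*[^*]*\*+(?:[^/*]|[^*]*\*+)*/         # traditional C comments
         |
         /[^\n]*                                # // comments; * so an empty one matches too
       )
    }{ defined $1 ? $1 : '' }gsex;              # /e only to avoid an undef warning
    return $src;
}

# A comment-like string is kept, a real comment is dropped.
print strip_comments(qq{s = "/* no */"; /* yes */\n});
# Prints: s = "/* no */"; 

# The nested example from above is removed as a whole.
print strip_comments("This \n/*\nmulti /* bla */\nline\ncomment\n*/\n is a sentence.\n");
# Prints "This ", an empty line, then " is a sentence."

# Caveat: two separate comments are merged, and "b" between them is lost.
print strip_comments("a /* x */ b /* y */ c\n");
# Prints: a  c
```

If that last case matters for your files, the inner group is the place to look.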

Cheers,
Curtis Poe

=====
Senior Programmer
Onsite! Technology (http://www.onsitetech.com/)
"Ovid" on http://www.perlmonks.org/
