On 3/9/07, John W. Krahn <[EMAIL PROTECTED]> wrote:

Chas Owens wrote:
> On 3/8/07, Dharshana Eswaran <[EMAIL PROTECTED]> wrote:
>>
>> I need to extract a few strings from one file and paste them into another
>> file.
> snip
>
> This doesn't seem like a good job for split.  The split function is
> good for parsing X separated records where X is either constant or
> simple.  What you have there is a grammar, specifically a subset of the
> C grammar for #define.  With grammars you want to use either a regex
> or Parse::RecDescent depending on the complexity of the grammar.  In
> this case the grammar is simple enough that a regex does fine.  I have
> created a regex that parses your record into a name, base number,
> operator, and modifying number.  The last two values are optional.  I
> have used the x option on the regex to make it readable since it is so
> large (anything bigger than 80 characters should probably use the x
> option).  You can learn more about regexes in perldoc perlre and
> perldoc perlretut.
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> while (<DATA>) {
>        my ($name, $base, $op, $mod) = m{
>                ^                     # start of string
>                \s*                   # optional spaces
>                \#define              # the start of the macro

The C preprocessor allows whitespace between '#' and 'define'.

                \# \s* define              # the start of the macro


>                \s+                   # mandatory spaces
>                (\w+)                 # capture the name of the macro
>                \s*                   # optional spaces

If the next token is a left parenthesis then the whitespace is not optional,
otherwise it would be a function-like macro definition.  Also, if the next
token is a word character then the whitespace is not optional.
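
A minimal sketch of the difference, using two hypothetical input lines (LIMIT
and SQUARE are made-up names, and both patterns are cut down to the part
before the operands):

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical sample lines, not from the original data.
my $object_like   = '#define LIMIT (BASE + 1)';       # space before '('
my $function_like = '#define SQUARE(x) ((x) * (x))';  # no space before '('

# Optional whitespace after the name, as in the quoted regex.
my $loose  = qr{ ^ \s* \# \s* define \s+ (\w+) \s* \( }x;

# Mandatory whitespace after the name.
my $strict = qr{ ^ \s* \# \s* define \s+ (\w+) \s+ \( }x;

for my $line ($object_like, $function_like) {
    my $loose_hit  = $line =~ $loose  ? 'match' : 'no match';
    my $strict_hit = $line =~ $strict ? 'match' : 'no match';
    print "$line\n\tloose: $loose_hit, strict: $strict_hit\n";
}
```

The loose pattern accepts both lines, so SQUARE(x) would be reported as a
macro named SQUARE with base x.  The strict pattern accepts only the first.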


>                \(                    # the open paren
>                \s*                   # optional spaces
>                (
>                        \w+ |
>                        \d+ |
>                        0x[a-fA-F0-9]

That only matches a single hexadecimal digit; you probably want
0x[a-fA-F0-9]+ instead.
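
A short sketch of that alternative taken in isolation, assuming a hypothetical
value of 0x1F (the real data was snipped):

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical sample value, not taken from the original data.
my $value = '0x1F';

# Without a quantifier the class matches exactly one hex digit.
my ($single) = $value =~ /^(0x[a-fA-F0-9])/;

# With + it matches the whole run of hex digits.
my ($full) = $value =~ /^(0x[a-fA-F0-9]+)/;

print "without +: $single\n";
print "with +:    $full\n";
```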


>                )                     # capture a word, int, or hex
>                \s*                   # optional spaces
>                (?:
>                        (             # capture the various int operators
>                                [+-|^*/%] |

Add '&', and put '-' first so that "+-|" is not read as a range:

                                 [-+|^&*/%] |
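
The position of '-' matters here: inside brackets, "+-|" is a range from '+'
to '|', which covers digits and letters as well, so a class written as
[+-|^&*/%] matches far more than operators.  A minimal sketch comparing the
two spellings (the test characters are arbitrary):

```perl
#!/usr/bin/perl

use strict;
use warnings;

# "+-|" inside the brackets is a range from '+' to '|'.
my $range_class   = qr{ \A [+-|^&*/%] \z }x;

# A leading '-' is a literal minus sign, so no range is formed.
my $literal_class = qr{ \A [-+|^&*/%] \z }x;

for my $char ('-', '&', 'A', '7') {
    my $range   = $char =~ $range_class   ? 'yes' : 'no';
    my $literal = $char =~ $literal_class ? 'yes' : 'no';
    print "'$char': range class $range, literal class $literal\n";
}
```

Both classes accept '-' and '&', but only the range version accepts 'A' and
'7'.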


>                                <<        |
>                                >>
>                        )

What about:

TOKEN & ~TOKEN

Or:

TOKEN * -TOKEN

:-)
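
In both cases the second operand starts with a unary operator, which \w+
cannot match.  One possible way to allow for that, as a rough sketch only:
the $operand pattern, the sample lines and their names are assumptions, not
part of the quoted regex or the original data.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical sample lines; the macro and token names are made up.
my @lines = (
    '#define MASK (FLAGS & ~BIT)',
    '#define NEG  (SCALE * -STEP)',
);

# An operand with an optional unary '~' or '-' in front (assumed extension).
my $operand = qr{ [-~]? (?: 0x[a-fA-F0-9]+ | \w+ ) }x;

my @parsed;
for my $line (@lines) {
    my ($name, $base, $op, $mod) = $line =~ m{
        ^ \s* \# \s* define \s+
        (\w+) \s+                     # macro name
        \( \s*
        ($operand) \s*                # first operand
        (?:
            ( << | >> | [-+|^&*/%] )  # binary operator
            \s*
            ($operand) \s*            # second operand
        )?
        \)
    }x;
    next unless defined $name;
    $op = $mod = '' unless defined $op;
    push @parsed, [ $name, $base, $op, $mod ];
    print "$line\n\tname=$name base=$base op=$op mod=$mod\n";
}
```

This still handles only a single binary operator with simple operands; nested
parentheses or longer expressions would need a real parser such as
Parse::RecDescent.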

>                        \s*           # optional spaces
>                        (             # capture a word, int, or hex
>                                \w+ |
>                                \d+ |
>                                0x[a-fA-F0-9])

                                 0x[a-fA-F0-9]+)


>                )?                    # but make the last two captures optional
>                \)                    # the close paren
>        }x;
>        $op = $mod = '' unless defined $op;
>        print;
>        print "\tname is $name, base is $base, modified by $mod using $op\n"
>                if $name;
> }



John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order.       -- Larry Wall





Thank you for the suggestions. I tried both ways.



> 0x[a-fA-F0-9]
>
> That only matches a single hexadecimal digit; you probably want
> 0x[a-fA-F0-9]+ instead.

When I tried Chas's regex without this correction, it still matched hex
values with more than one digit.
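
One likely explanation, sketched with a hypothetical value of 0x1F: the \w+
alternative comes first, and '0', 'x' and the hex digits are all word
characters, so \w+ matches the whole value before the hex alternative is ever
tried.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical value; the real data was snipped from the thread.
my $value = '0x1F';

# The same three alternatives in the same order, each in its own group
# so that we can see which one actually matched.
my ($word, $int, $hex) = $value =~ m{
    \A (?: (\w+) | (\d+) | (0x[a-fA-F0-9]) ) \z
}x;

print "matched by \\w+: $word\n"       if defined $word;
print "matched by \\d+: $int\n"        if defined $int;
print "matched by the hex part: $hex\n" if defined $hex;
```

Only the first line is printed, so the hex alternative (with or without the
+) is never reached for this input.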

But I did not understand these lines from John:

> What about:
>
> TOKEN & ~TOKEN
>
> Or:
>
> TOKEN * -TOKEN
>
> :-)

Could you please explain them?

Thanks once again for the immediate response.


Thanks and Regards,
Dharshana
