Re: Regular expression question: non-greedy matches

R. Joseph Newton Mon, 05 Apr 2004 14:31:52 -0700

Boris Shor wrote:

> Thanks for writing.


Hi Boris,

Please don't top-post.  It makes it very difficult to get the context of your
message.  Instead, post following the material to which you are responding.

> Your code works for this example but doesn't get exactly
> what I need. It's important to me to keep $1 and $2 separate because Yeas
> and Nays are paired together (these are votes on bills). But sometimes, you
> only have Yeas (eg, a unanimous vote).
>
> That is why I want to see:
>
> 123             <- from the first yea ($1)
> (nothing)       <- no nay! ($2)
> 456             <- from the second yea ($1)
> 789             <- from the second nay ($2)

Why would you want to do this?  It seems to me that this is taking information
and turning it into meaningless data.  A series of numbers piled on top of each
other doesn't really communicate much.  What do you want to get out of the
process as a whole?

>
>
> Hence why I put a ? After the (?:Nay (.*?)x) regexp;

That is a bit off.  I think we really need a sample of actual data to be able to
help you.  If the data is of a confidential nature, then you will have to do
meaningful substitutions for any matter that is not public.  Boilerplate
substitutions do not work.  So far, I have seen three different formats for your
sample string.  Each of them would call logically for a somewhat different
extraction approach.

My best advice would be not to do it all in one regex.  Regular expressions are
powerful tools, and amazingly efficient given the demands placed on them, but
they get progressively less efficient as they increase in complexity.  If there
is any distinct marker that separates the items being voted on, I would strongly
recommend that you first split on this marker so that each vote has its own
element.
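
As a rough sketch of the split-first idea, and only under assumptions: the
sample string, the ';' separator, and the hash layout below are invented for
illustration, since the real data format has not been shown.  The "Yea ...x"
and "Nay ...x" shapes are taken from the pattern quoted above.

```perl
use strict;
use warnings;

# Hypothetical sample: the ';' separator and the layout are assumptions,
# not the real data.
my $line = 'Yea 123x; Yea 456x Nay 789x';

my @votes;
for my $record (split /\s*;\s*/, $line) {
    # Each record now holds a single vote, so two small patterns suffice.
    my ($yea) = $record =~ /Yea (.*?)x/;
    my ($nay) = $record =~ /Nay (.*?)x/;
    push @votes, { yea => $yea, nay => $nay };
}

for my $vote (@votes) {
    printf "yea=%s nay=%s\n",
        defined $vote->{yea} ? $vote->{yea} : '(none)',
        defined $vote->{nay} ? $vote->{nay} : '(none)';
}
```

Because each Yea stays in the same hash as its Nay, the pairing survives even
when one side is missing.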

It is much better to have explicit 0's for the losing side in any unanimous
vote.  Undefined values only confuse issues.
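
One possible way to get explicit zeros, sketched under assumptions: tally() is
a hypothetical helper, and it assumes the counts are plain digits following
"Yea " and "Nay ", which may not hold for the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (yeas, nays) for one vote record,
# substituting 0 for a side that does not appear.
sub tally {
    my ($record) = @_;
    my ($yea) = $record =~ /Yea (\d+)/;
    my ($nay) = $record =~ /Nay (\d+)/;
    return (defined $yea ? $yea : 0, defined $nay ? $nay : 0);
}

my @unanimous  = tally('Yea 123x');           # assumed record layout
my @split_vote = tally('Yea 456x Nay 789x');  # assumed record layout
print "@unanimous\n";
print "@split_vote\n";
```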

> the idea being this can
> appear zero or one times. But if I do this, I get no matches on the 'nays'
> or $2.

That is a pretty strong indication that the single-regex approach is not the way
to go for this job.
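
To illustrate why $2 can come back empty, here is a small sketch.  The first
pattern is only an assumed shape of the single-regex attempt, not the exact
code posted, and the sample string is made up.  A lazy .*? in front of an
optional group is satisfied by matching nothing, and the optional group is
then free to match zero times, so the overall match succeeds before the regex
engine ever looks at the Nay.

```perl
use strict;
use warnings;

my $text = 'Yea 456x Nay 789x';   # hypothetical sample

# Assumed shape of the single-regex attempt: lazy gap, then optional group.
my ($yea, $nay) = $text =~ /Yea (.*?)x.*?(?:Nay (.*?)x)?/;
printf "lazy gap: yea=%s nay=%s\n", $yea, defined $nay ? $nay : '(undef)';

# Spelling out the gap as a literal space lets the optional group match,
# but only if the real data always has exactly that layout.
my ($yea2, $nay2) = $text =~ /Yea (.*?)x(?: Nay (.*?)x)?/;
printf "anchored: yea=%s nay=%s\n", $yea2, defined $nay2 ? $nay2 : '(undef)';
```

The second pattern works here only because the gap is known exactly, which is
another reason the actual data format matters.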

Can you give us a little more information on what you are trying to accomplish
overall?  You get there much faster when you know your destination.

Joseph


