Here's something a bit simpler, based on the original example Barry sent.
It looks for a single upper-case letter preceded by a single character
that is neither upper case nor white space. \w doesn't do that, since it
also matches upper-case letters, digits and underscores. We also don't
need the "+" quantifier, since all we care about is matching a single
char. (Performance should be better when not searching for a
variable-length string.)
perl -we 'my $t="madeStyle\nfacilitatedOne\nAnti-magneticQuality\n123FOO BAR";
$t=~s/([^A-Z\s])([A-Z])/$1. $2/g;
print "----------\n$t\n";'
----------
made. Style
facilitated. One
Anti-magnetic. Quality
123. FOO BAR
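
For use in a script rather than a one-liner, the same substitution could be wrapped in a sub. This is only a sketch: split_joined() is a made-up name for illustration, and the sample strings are the ones from the one-liner above.

```perl
use strict;
use warnings;

# split_joined() is a hypothetical helper name, not from the original post.
# It inserts ". " between a character that is neither upper case nor
# white space and the upper-case letter that immediately follows it.
sub split_joined {
    my ($text) = @_;
    $text =~ s/([^A-Z\s])([A-Z])/$1. $2/g;
    return $text;
}

print split_joined($_), "\n"
    for 'madeStyle', 'facilitatedOne', 'Anti-magneticQuality', '123FOO BAR';
```

Note that a digit counts as "not upper case, not white space" here, so 123FOO is still split after the 3.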
Curtis
________________________________
From: [email protected]
[mailto:[email protected]] On Behalf Of
[email protected]
Sent: Friday, May 15, 2009 8:55 PM
To: [email protected]
Cc: [email protected]
Subject: Re: Help with Regular Expression
hi ari and barry --
In a message dated 5/15/2009 6:20:40 PM Eastern Standard Time,
[email protected] writes:
> On Fri, May 15, 2009 at 11:18 PM, Barry Brevik <[email protected]> wrote:
>
> > I am running Active Perl 5.8.8.
> > ...
> > Difficulty: the fields contain hundreds of words both preceding and
> > following the "bad" words, so I have to be able to pick out the
> > lower-case words that contain one embedded upper-case character.
> > ...
> > Barry Brevik
>
> Hi Barry,
>
> Maybe something like this would help:
>
> $ cat test.txt
> madeStyle
> facilitatedOne
> Anti-magneticQuality
>
> $ cat test.txt |perl -pe 's/(\w+)([A-Z])/\1\. \2/g'
> made. Style
> facilitated. One
> Anti-magnetic. Quality
>
> Regards, Ari Constancio
the replacement string in a s/// should use capture variables rather
than backreferences; perl warns about this if warnings are on (always
a good idea). a '.' (period) character in a replacement string is not
a metacharacter and needs no escape.
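
a small sketch of the distinction, assuming nothing beyond core perl: \1 belongs inside the pattern, where it refers back to what group 1 matched, while $1 belongs in the replacement. the sub names collapse_doubled() and add_period() are made up for illustration.

```perl
use strict;
use warnings;

# collapse_doubled() is a hypothetical helper used only for illustration.
# Inside the pattern, \1 is a backreference to whatever group 1 matched;
# in the replacement, the capture variable $1 is used instead.
sub collapse_doubled {
    my ($text) = @_;
    $text =~ s/\b(\w+) \1\b/$1/g;
    return $text;
}

# add_period() is likewise hypothetical. The period in the replacement
# is a literal character and needs no backslash.
sub add_period {
    my ($text) = @_;
    $text =~ s/([[:lower:]])([[:upper:]])/$1. $2/g;
    return $text;
}

print collapse_doubled('the the cat'), "\n";
print add_period('madeStyle'), "\n";
```
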
also, the regex used, /(\w+)([A-Z])/, will allow any number greater than
zero of upper case letters, digits or underscores to precede the uc letter
that is supposed to be the initial letter of a new sentence: probably not
what is intended.
>cat test.txt
madeStyle
facilitatedOne
Anti-magneticQuality
123FOO
>cat test.txt | perl -wMstrict -pe "s/(\w+)([A-Z])/\1\. \2/g"
\1 better written as $1 at -e line 1.
\2 better written as $2 at -e line 1.
made. Style
facilitated. One
Anti-magnetic. Quality
123FO. O
a better approach might be something like:
>cat test.txt | perl -wMstrict -pe "s{ ([[:lower:]]) ([[:upper:]] [[:lower:]]) }{$1. $2}xmsg"
made. Style
facilitated. One
Anti-magnetic. Quality
123FOO
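
the same substitution might look like this in a script, with the /x flag used to spread the pattern over several commented lines. this is a sketch only; split_sentences() is a made-up name, and the test strings are the ones from test.txt above.

```perl
use strict;
use warnings;

# split_sentences() is a hypothetical helper name used for illustration.
# It only fires where a lower-case letter is directly followed by an
# upper-case letter and then another lower-case letter, so runs of
# capitals such as "FOO" are left alone.
sub split_sentences {
    my ($text) = @_;
    $text =~ s{
        ([[:lower:]])                # last letter of the first word
        ([[:upper:]] [[:lower:]])    # capitalised start of the next word
    }{$1. $2}xmsg;
    return $text;
}

print split_sentences($_), "\n"
    for 'madeStyle', 'facilitatedOne', 'Anti-magneticQuality', '123FOO';
```
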
hth -- bill walters
_______________________________________________
ActivePerl mailing list
[email protected]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs