Re: [SLUG] Perl Regular expression help

2010-07-27 Thread Martin Barry
Sorry to bring up an old thread but I just had to comment on this...

$quoted_author = Jamie Wilkinson ;
 
> Try:
>
> /pg=[^&]*/
>
> match zero or more of the character class that is not an ampersand.

Except there is nothing stopping the variables being reordered, no? So you
may need to match a leading ? instead of &.

You could get crazy and try to do this in a single regex but two stage is
clearer. e.g.

sed -e 's/&pg=[^&]*//g' -e 's/?pg=[^&]*/?/'


cheers
Marty
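
As a rough illustration of the two-stage approach above, the sketch below wraps a two-expression sed in a shell function and runs it on a few made-up URLs. The function name strip_pg_two_stage and the sample URLs are assumptions for illustration. The second expression uses &* (zero or more ampersands) as a variation, so that a pg= sitting directly after the ? is removed whether or not another parameter follows.

```shell
# Hypothetical helper: strip the pg= parameter from a URL in two passes.
strip_pg_two_stage() {
    # Pass 1: drop "&pg=value" wherever it follows another parameter.
    # Pass 2: drop "pg=value" (and a following "&", if any) right after "?".
    printf '%s\n' "$1" | sed -e 's/&pg=[^&]*//g' -e 's/?pg=[^&]*&*/?/'
}

strip_pg_two_stage 'index.html?pg=3&arg=x'   # index.html?arg=x
strip_pg_two_stage 'index.html?arg=x&pg=3'   # index.html?arg=x
strip_pg_two_stage 'index.html?pg=3'         # index.html?
```

When pg= is the only parameter, a bare ? is left behind; whether that matters depends on what consumes the URL.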
-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html


Re: [SLUG] Perl Regular expression help

2010-07-27 Thread Jamie Wilkinson
Call me crazy!

s/(&|?)pg=[^&]*/\1/

(correct escaping of & and ? left as an exercise for someone actually using
this :)
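
One possible answer to the escaping exercise, sketched with sed -E (extended regular expressions) rather than Perl: in the pattern only the ? needs a backslash, because & is an ordinary character on the matching side. The function name strip_pg_single and the sample URLs are made up for illustration.

```shell
# Hypothetical helper: single-expression variant using a capture group.
strip_pg_single() {
    # (&|\?) captures the separator in front of pg=; \1 puts it back.
    printf '%s\n' "$1" | sed -E 's/(&|\?)pg=[^&]*/\1/'
}

strip_pg_single 'index.html?pg=3&arg=x'   # index.html?&arg=x
strip_pg_single 'index.html?arg=x&pg=3'   # index.html?arg=x&
```

Because the captured separator is put back, a stray & is left next to it in both cases, so a second clean-up step may still be needed. In Perl the same pattern would normally be written with $1 rather than \1 in the replacement.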



Re: [SLUG] Perl Regular expression help

2010-07-27 Thread Lindsay Holmwood

On 28/07/2010, at 1:03, Martin Barry ma...@supine.com wrote:

> You could get crazy and try to do this in a single regex but two stage is
> clearer. e.g.
>
> sed -e 's/&pg=[^&]*//g' -e 's/?pg=[^&]*/?/'


Now you have 2 problems. 


Fwd: Re: [SLUG] Perl Regular expression help

2010-07-15 Thread Jon Jermey




Perhaps this is the time for me to ask: does anyone know a way for grep
or awk to extract from a text file any sequence of up to, say, six words
that begins and ends with an initially-capitalised word -- whether or
not it is part of a larger matching sequence?

So if the input text was:

Sally Lee Jones worked for the United Nations Support Team

the output would be

Sally Lee
Lee Jones
Sally Lee Jones
Jones worked for the United
Jones worked for the United Nations
United Nations
Nations Support
Support Team
United Nations Support Team

I don't particularly care if it takes one pass or several, and I can
clean up duplicates afterwards.

This is not a serious problem for me  -- it falls into the 'would be
nice to have' category -- but I've been puzzling over it off and on for
a while, and the mention of the word 'greedy' reminded me of it.

Jon.
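
One possible awk sketch for this, under some assumptions: words are whitespace-separated, "initially-capitalised" means the word starts with an ASCII letter A to Z, and a span may not cross a line break. The function name capitalised_spans and the max variable are made up for illustration. It prints every qualifying span, so for the sample sentence it yields a few more lines than the list above (for example "United Nations Support").

```shell
# Hypothetical helper: print every run of 2..max words that starts and
# ends with a capitalised word, including runs nested inside longer ones.
capitalised_spans() {
    # Reads text on stdin; $1 is the maximum span length in words (default 6).
    awk -v max="${1:-6}" '{
        for (i = 1; i <= NF; i++) {
            if ($i !~ /^[A-Z]/) continue          # span must start capitalised
            for (j = i + 1; j <= NF && j - i + 1 <= max; j++) {
                if ($j !~ /^[A-Z]/) continue      # ...and end capitalised
                span = $i
                for (k = i + 1; k <= j; k++) span = span " " $k
                print span
            }
        }
    }'
}

echo 'Sally Lee Jones worked for the United Nations Support Team' | capitalised_spans
```

Duplicates across lines can be removed afterwards with sort -u, as the post suggests cleaning up in a later pass.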

On 14/07/10 18:06, Nick Andrew wrote:


> (aaa...)&
>
> Where the stuff inside () is what's being matched. The matched part stops
> at the first & or the end of the string. It's greedy so it matches as long
> a string as possible.
>
> Nick.






Re: [SLUG] Perl Regular expression help

2010-07-14 Thread Nick Andrew
On Wed, Jul 14, 2010 at 01:27:13PM +1000, Peter Rundle wrote:
> I don't really understand how the [^&] followed by the * works but it does.

Any character which is not an ampersand, repeated zero or more times.

So it matches

()
()&
(a)
(a)&
(aaa...)
(aaa...)&

Where the stuff inside () is what's being matched. The matched part stops
at the first & or the end of the string. It's greedy so it matches as long
a string as possible.

Nick.
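
The list above can be reproduced with a small sed sketch that wraps whatever [^&]* matches in parentheses. The helper name mark_match and the inputs are assumptions for illustration; note that in sed the unescaped & in the replacement stands for the matched text.

```shell
# Hypothetical helper: show what [^&]* matches by wrapping it in ().
mark_match() {
    # [^&]* is greedy: it takes every character up to the first "&"
    # (or the end of the line) and is also happy to match nothing at all.
    printf '%s\n' "$1" | sed 's/[^&]*/(&)/'
}

mark_match ''          # ()
mark_match '&'         # ()&
mark_match 'a&'        # (a)&
mark_match 'aaa&bbb'   # (aaa)&bbb
```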


Re: [SLUG] Perl Regular expression help

2010-07-14 Thread Lindsay Holmwood

On 14/07/2010, at 13:27, Peter Rundle pe...@aerodonetix.com.au wrote:


> P.S. I didn't understand Lindsay's question about doing the replace.
> I'm replacing the arg with nothing, i.e. I just want to remove the
> pg= argument from the string.




Didn't know what you were replacing your match with, was just curious  
about how other people would solve this problem.


Lindsay 


Re: [SLUG] Perl Regular expression help

2010-07-14 Thread Peter Rundle

Thanks Nick. What I didn't know was that ^ inside brackets [] means "not". I was still
reading ^ as beginning of string.

That will be very very useful in future.

Thanks

Pete



Re: [SLUG] Perl Regular expression help

2010-07-13 Thread Jamie Wilkinson
Try:

/pg=[^&]*/

match zero or more of the character class that is not an ampersand.

On 13 July 2010 17:21, Peter Rundle pe...@aerodonetix.com.au wrote:

> Hi Sluggers,
>
> I'm sure some of you genii have a real quick solution to this.
>
> I'm trying to find and replace an argument in a URL. The URL is of the
> form
>
> pg=something&arg=somethingelse
>
> I want to take out the pg=something but the arg= may or may not be
> there. How do I say match the pg=something up to but not including the next
> & (which may or may not be there)?
>
> /pg=.*&/
>
> But also I think & is a special char (no?) that means put the matched bit
> back, though is that only on the replace side? (My question relates
> strictly to the matching side.)
>
> TIA's
>
> Pete




Re: [SLUG] Perl Regular expression help

2010-07-13 Thread Lindsay Holmwood
Now you've got the search, I'm curious how you are going to do the replace.

Is the Perlism to just use the substitute operator, or split on the
pattern, iterate through the array, and join again?

Lindsay

-- 
w: http://holmwood.id.au/~lindsay/
t: @auxesis


Re: [SLUG] Perl Regular expression help

2010-07-13 Thread Jamie Wilkinson
I'd use a global search and replace command, if it were me, and I was using
sed: sed -i -e 's/pg=[^&]*//g' lindsay.html



Re: [SLUG] Perl Regular expression help

2010-07-13 Thread Tony Sceats
How about using a slightly different approach with split:

@input = split /\&/;

$input[0] should now be pg=something, $input[1] will be
arg=somethingelse

so you can trivially match, modify and print this to your output, whether or
not it has extra arguments.
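
The same split-and-rejoin idea can be sketched outside Perl with standard tools; this is a shell equivalent, not the Perl code from the post. The function name drop_pg is made up, and the input is assumed to be a bare query string with no leading ?.

```shell
# Hypothetical helper: remove the pg= parameter by splitting and rejoining.
drop_pg() {
    # Split on "&" (one parameter per line), drop any pg= parameter,
    # then join the remaining parameters with "&" again.
    printf '%s\n' "$1" | tr '&' '\n' | grep -v '^pg=' | paste -s -d '&' -
}

drop_pg 'pg=something&arg=somethingelse'   # arg=somethingelse
drop_pg 'a=1&pg=2&b=3'                     # a=1&b=3
```

Because each parameter is handled as a separate item, the position of pg= in the string no longer matters and no stray separators are left behind.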








RE: [SLUG] Perl Regular expression help

2010-07-13 Thread Ken Foskey
>   /pg=.*&/
>
> But also I think & is a special char (no?) that means put the matched bit
> back, though is that only on the replace side? (my
> question relates strictly to the matching side).


Yes, the ampersand is special: it represents the complete matched string on
the replace side.

s/pg=.*&/\&/

As pointed out, the solution is not optimal: if there are more than two
parameters it will consume them all. It will also NOT remove a trailing
parameter because the second & is not there.

Ken
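
A short sed sketch of both points, using made-up query strings and helper names: an unescaped & in the replacement stands for the whole match, \& is a literal ampersand, and the greedy .* runs to the last & on the line. This is sed behaviour; in Perl the whole match is spelled $& rather than &.

```shell
# Hypothetical helpers for illustration.
show_whole_match() {
    # Unescaped & in the replacement is the matched text.
    printf '%s\n' "$1" | sed 's/pg=[^&]*/[&]/'
}
greedy_strip() {
    # \& is a literal ampersand; .* is greedy and runs to the last "&".
    printf '%s\n' "$1" | sed 's/pg=.*&/\&/'
}

show_whole_match 'pg=1&a=2&b=3'   # [pg=1]&a=2&b=3
greedy_strip 'pg=1&a=2&b=3'       # &b=3  (a=2 was swallowed as well)
greedy_strip 'pg=1'               # pg=1  (no "&", so nothing matches)
```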



Re: [SLUG] Perl Regular expression help

2010-07-13 Thread Jamie Wilkinson
I don't see the problem with my approach; the match will terminate when it
sees the second ampersand, without consuming it.



Re: [SLUG] Perl Regular expression help

2010-07-13 Thread Peter Rundle

Thanks Jamie (and others),

That works a treat.

I would have tried

/pg=*[^&]/

which of course would have matched all ampersands up until the last, taking out
more than one argument.
I don't really understand how the [^&] followed by the * works but it does.

Thanks

Pete

P.S. I didn't understand Lindsay's question about doing the replace. I'm
replacing the arg with nothing, i.e. I just want to remove the pg= argument
from the string.
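
A quick sed check of where the * sits may help here (the helper name mark and the input are made up for illustration). A quantifier applies to the item immediately before it, so in pg=*[^&] the * repeats the = sign and [^&] then matches exactly one character, whereas in pg=[^&]* the * repeats the character class. Perl treats the two patterns the same way.

```shell
# Hypothetical helper: wrap the part of the input matched by a pattern in ().
mark() {
    # $1 is the pattern, $2 the input.
    printf '%s\n' "$2" | sed "s/$1/(&)/"
}

mark 'pg=*[^&]' 'pg=something&arg=else'   # (pg=s)omething&arg=else
mark 'pg=[^&]*' 'pg=something&arg=else'   # (pg=something)&arg=else
```

So the first form matches less than intended rather than more: it stops one character after the = sign.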



