RE: Regex help

2004-09-13 Thread Jim

  
> Hi all,

Hi

> 
> create a filename.
> 
> I firstly need to remove any invalid characters (including spaces)

What do you consider invalid chars? Do you want just alpha chars?

> 
> $filename = 'News & events';
> $filename =~ 
> s/[\w\&%'[EMAIL PROTECTED](\)&_\\+,\.=\[\]]//g; 
> 
> then convert it to lowercase.

$filename =~ s/[^A-Za-z]//g; #use only alpha chars
print lc($filename); # convert to lower case

# or if you want to strip out non printing (control) chars:
$filename = " News \r \n \x01 even ts";
$filename =~ s/[[:cntrl:]]//g;
$filename =~ s/\s//g;
print lc($filename);

That help at all?
Jim







---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.745 / Virus Database: 497 - Release Date: 8/27/2004
 


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: Regex help

2004-09-13 Thread Johnstone, Colin

Hi Jim,

rather than re-invent the wheel I would prefer if you could fix this regex
I believe it covers all invalid characters one would encounter

s/[\w\&%'[EMAIL PROTECTED](\)&_\\+,\.=\[\]]//g;

I would then use it as a general purpose regex for validating filenames.

Regards
Colin

  
This E-Mail is intended only for the addressee. Its use is limited to that
intended by the author at the time and it is not to be distributed without the
author's consent. Unless otherwise stated, the State of Queensland accepts no
liability for the contents of this E-Mail except where subsequently confirmed in
writing. The opinions expressed in this E-Mail are those of the author and do
not necessarily represent the views of the State of Queensland. This E-Mail is
confidential and may be subject to a claim of legal privilege.

If you have received this E-Mail in error, please notify the author and delete this 
message immediately.



RE: Regex help

2004-09-13 Thread Raymond Raj

Hi,

You should also escape the "/" character; the problem is here, it must be <\/sub>.
Even so, this regular expression will not give the result you are expecting: it
removes all alphanumeric characters as well.

For help, try "perldoc perlre", or wait a while and someone may give you a good guide.

regards
Raymond


RE: Regex help

2004-09-13 Thread Jim

 
> 
> Hi Jim,
> 
> rather than re-invent the wheel I would prefer if you could 
> fix this regex I believe it covers all invalid characters one 
> would encounter
> 
> s/[\w\&%'[EMAIL PROTECTED](\)&_\\+,\.=\[\]]//g;
> 
> I would then use it as a general purpose regex for validating 
> filenames.

Please tell us what you are trying to do. Without knowing more, this regex
does not make any sense to me.


RE: Regex help

2004-09-13 Thread Johnstone, Colin

Hi Jim,

In our cms 'Teamsite' our users will create a primary navigation menu using one of the 
templates

In the template I require them to enter the Anchor text and url for each of the 
primary menu items

for the purposes of this example I will provide only the Anchor text below

e.g Home, Enrolments, Courses, Getting Here, News & Events, About Us

As well as providing the primary menu above they then will provide a secondary menu.

In the secondary menu template they have to associate the menu they are building with 
one of the primary menu items above. They do this by choosing from a drop-down list. 
This list is produced by a Perl script I have written that parses the primary nav menu 
XML record and returns a drop-down list of options.

the label being the primary menu items as above 
the value will be the filename generated by the regex that we are trying to write.

So in short the regex must remove any characters from the anchor text e.g the & in 
'News & events'
and any spaces, basically any characters that cannot be used in a filename if they 
have been inadvertently entered by the users.

Any help appreciated thank you
Colin







Re: Regex help

2004-09-13 Thread John W. Krahn

Johnstone, Colin wrote:
> Hi Jim,

Hello, (my name is John BTW)

> rather than re-invent the wheel I would prefer if you could fix this regex
> I believe it covers all invalid characters one would encounter
> 
> s/[\w\&%'[EMAIL PROTECTED](\)&_\\+,\.=\[\]]//g;

Most of those characters are valid in DOS file names and all except '/' are 
valid in Unix file names.  How do you define invalid?
You have '<', '>' and '_' listed twice and 's', 'u', 'a' and 'm' listed three 
times and 'p' listed five times because?
You are using '/' to delimit the regexp and string yet you have an unescaped 
'/' in the character class which is an error and will not compile.

Maybe it would be better if you tell us what characters you consider to be 
*VALID*.  :-)

John
--
use Perl;
program
fulfillment

RE: Regex help

2004-09-13 Thread Johnstone, Colin

Hi John,

I found this regex by searching on google and I assumed the guy who wrote it knew more 
than me.

I am creating unix filenames on this project but Teamsite can also be run under a 
windows environment.

I guess I was looking for a general purpose regex to remove invalid characters from 
filenames to add to my developers toolkit to take from project to project.

From the menu item entered into the primary menu template I need to generate a 
filename.

Specifically for this project I would like to remove spaces and ampersands 
apostrophes, quotes should users enter them.

Regards
Colin


Re: Regex help

2004-09-14 Thread Gunnar Hjalmarsson

[ Please type your reply below the quoted part of the message you
reply to. ]

Colin Johnstone wrote:
> Jim wrote:
>> Colin Johnstone wrote:
>>> I firstly need to remove any invalid characters (including
>>> spaces)
>>>
>>> $filename = 'News & events';
>>> $filename =~
>>> s/[\w\&%'[EMAIL PROTECTED](\)&_\\+,\.=\[\]]//g;
>>>
>>> then convert it to lowercase.
>>
>> $filename =~ s/[^A-Za-z]//g; #use only alpha chars
>> print lc($filename); # convert to lower case
>>
>> # or if you want to strip out non printing (control) chars:
>> $filename = " News \r \n \x01 even ts";
>> $filename =~ s/[[:cntrl:]]//g;
>> $filename =~ s/\s//g;
>
> rather than re-invent the wheel I would prefer if you could fix
> this regex I believe it covers all invalid characters one would
> encounter
>
> s/[\w\&%'[EMAIL PROTECTED](\)&_\\+,\.=\[\]]//g;

In this case, re-inventing the wheel, as you put it, is much more
convenient than using that regex as the starting-point for solving
your problem.

> I would then use it as a general purpose regex for validating
> filenames.

The main problem is that your approach, i.e. trying to identify
"invalid characters", is not the best choice. Generally it is
advisable to do it the other way around: Decide which characters you
want to accept, and remove the rest.

Drop the regex you "found on the web" and start listening to the
suggestions given.
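
A minimal sketch of the "decide what to accept, remove the rest" approach,
assuming (purely as an illustration) that lowercase letters, digits,
underscore and hyphen are the characters to keep. The sub name safe_filename
is made up for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: keep only characters from an explicit accept-list.
# The accepted set (a-z, 0-9, underscore, hyphen) is an assumption.
sub safe_filename {
    my ($text) = @_;
    my $name = lc $text;          # convert to lowercase first
    $name =~ s/[^a-z0-9_-]//g;    # delete everything NOT on the accept-list
    return $name;
}

print safe_filename('News & events'), "\n";   # prints "newsevents"
```

Because the class is negated, a newline, a NUL byte or any other unexpected
character is removed without having to be listed.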
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

RE: Regex help

2004-09-14 Thread Jenda Krynicky

From: "Johnstone, Colin" <[EMAIL PROTECTED]>
> rather than re-invent the wheel I would prefer if you could fix this
> regex I believe it covers all invalid characters one would encounter
> 
> s/[\w\&%'[EMAIL PROTECTED](\)&_\\+,\.=\[\]]//g;
> 
> I would then use it as a general purpose regex for validating
> filenames.

Never ever "remove invalid characters". Always "remove everything 
except the safe characters". What if someone sends you a newline? Or 
a character with code 0? Or ...

Your regexp makes very little sense. You definitely should go read 
perlretut or something. The [] denotes a character class, there is no 
difference whatsoever between [<sup>] and [<>ups]!
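
To illustrate the point that a character class is just a set of characters,
here is a small sketch. The sub strip_class and the sample string are invented
for the example; the two classes list the same five characters in a different
order:

```perl
use strict;
use warnings;

# Hypothetical helper: delete every character that belongs to the class.
sub strip_class {
    my ($class, $text) = @_;
    $text =~ s/[$class]//g;   # $class is interpolated into the brackets
    return $text;
}

my $text   = 'x<sup>2</sup>';
my $first  = strip_class('<sup>', $text);   # looks like a tag, is a set
my $second = strip_class('<>ups', $text);   # same set, other order

print "first:  $first\n";
print "second: $second\n";
print "identical\n" if $first eq $second;
```

Order and repetition inside the brackets make no difference to what matches.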

Jenda
= [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed 
to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery



RE: Regex help

2004-09-14 Thread Chris Devers

On Tue, 14 Sep 2004, Johnstone, Colin wrote:

> I found this regex by searching on google and I assumed the guy who 
> wrote it knew more than me.

From the look of this regex, that ain't necessarily so.

The thing with regular expressions is that they tend to be crafted for 
very specific purposes, and unless this guy was trying to solve the 
exact same problem you are, the regex he wrote may not do what you need.

> I guess I was looking for a general purpose regex to remove invalid 
> characters from filenames to add to my developers toolkit to take from 
> project to project.

Shouldn't a toolkit consist of things you actually understand?

It's like the old "give a man a fish" saying -- if as a result of this 
discussion you come away with a regex that solves this specific problem 
today, then you will be able to eat today. If on the other hand you 
start learning how regular expressions work, you'll be able to write 
your own ones and won't have to rely on questionable canned examples.

> Specifically for this project I would like to remove spaces and 
> ampersands apostrophes, quotes should users enter them.

That's nice.

As others asked you, what would you like to *keep* ? 

This regular expression can be much more simply expressed in terms of 
matching everything other than that which you know you want. You 
probably want letters, digits, and maybe some punctuation, right? Try to 
list all the characters that are valid in filenames on all the platforms 
that Teamsite currently or could potentially run on, and then you'll be 
most of the way to the regex you want. 

Also, if you just want to strip out bad characters a substitution regex 
(one with s/.../.../ syntax) may be much more complicated than a simple 
tr/...// statement. Have you tried doing it that way ?
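
A sketch of the tr/// route, under the assumption that letters, digits,
underscore, dot and hyphen are what you want to keep (that list and the sub
name keep_safe are only examples):

```perl
use strict;
use warnings;

# tr/// with the c (complement) and d (delete) modifiers removes every
# character that is NOT in the list. The list itself is an assumption.
sub keep_safe {
    my ($text) = @_;
    (my $name = lc $text) =~ tr/a-z0-9_.-//cd;
    return $name;
}

print keep_safe('News & Events'), "\n";   # prints "newsevents"
```

tr/// works on a fixed list of characters rather than a pattern, so there is
nothing to escape apart from the delimiter and the hyphen, which is placed
last here so that it is taken literally.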

-- 
Chris Devers


RE: Regex help

2004-09-14 Thread Jim

  
> So in short the regex must remove any characters from the 
> anchor text e.g the & in 'News & events'
> and any spaces, basically any characters that cannot be used 
> in a filename if they have been inadvertently entered by the users.

What kind of file names? DOS, UNIX, Other? 


RE: regex help

2004-09-23 Thread Johnstone, Colin

I worked it out myself

$safeString = "News & Events";
$safeString =~ s!&!and!g;
$safeString =~ s!\s!_!g;

thank you
Colin

-Original Message-
From: Johnstone, Colin 
Sent: Friday, September 24, 2004 3:32 PM
To: [EMAIL PROTECTED]
Subject: regex help


Gidday all,

Im trying to write a regex to convert spaces to underscores and ampersands to 'and'  
can someone help.

$safeString = "News & Events";
$safeString =~ s/&/and/g;
$safeString =~ s/\s/_/g;

Regards

Colin 



Re: regex help

2004-09-23 Thread Ramprasad A Padmanabhan

On Fri, 2004-09-24 at 11:01, Johnstone, Colin wrote:
> Gidday all,
> 
> Im trying to write a regex to convert spaces to underscores and ampersands to 'and'  
> can someone help.
> 
> $safeString = "News & Events";
> $safeString =~ s/&/and/g;
> $safeString =~ s/\s/_/g;
> 
> Regards
> 
> Colin 
> 

What you have written is perfectly fine. Why do you think it will not
work?
You may, however, consider writing '\&' instead of '&'.
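
For what it is worth, '&' is not a metacharacter in a Perl pattern, so the
escaped and unescaped forms should behave the same. A quick check (the
variable names are made up for this sketch):

```perl
use strict;
use warnings;

my $plain   = 'News & Events';
my $escaped = 'News & Events';

$plain   =~ s/&/and/g;    # unescaped ampersand
$escaped =~ s/\&/and/g;   # escaped ampersand

print $plain eq $escaped ? "same result: $plain\n" : "results differ\n";
```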

Bye
Ram





RE: regex help

2004-09-23 Thread Raymond Raj

Hi!

What's the problem with these regular expressions? Everything is correct!

If you need to convert a run of one or more matches into a single replacement,
then the regexp should be

$safeString =~ s/\s+/_/g;
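
The difference only shows up when the input has several whitespace characters
in a row. A small sketch (the sub names and the doubled spaces in the sample
are invented for the comparison):

```perl
use strict;
use warnings;

# One underscore per whitespace character.
sub underscore_each {
    my ($text) = @_;
    $text =~ s/&/and/g;
    $text =~ s/\s/_/g;
    return $text;
}

# One underscore per run of whitespace.
sub underscore_runs {
    my ($text) = @_;
    $text =~ s/&/and/g;
    $text =~ s/\s+/_/g;
    return $text;
}

my $label = 'News  &  Events';            # note the doubled spaces
print underscore_each($label), "\n";      # News__and__Events
print underscore_runs($label), "\n";      # News_and_Events
```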





RE: regex help

2004-09-23 Thread Johnstone, Colin

Thank you. I must have just coded something wrong; you're right, they do work. Sorry!


Re: Regex help

2005-01-16 Thread Dave Gray

On Sat, 15 Jan 2005 01:25:21 -0800 (PST), Ajey Kulkarni <[EMAIL PROTECTED]> 
wrote:
> I'm trying to match a floating point in ada.
> They are normal floating points with 2 extra things.(they can or can't
> come)
> 
> 1. an underscore is permitted between the digits and
> 2. An alternate numeric base may be specified surrounding the nonexponent
> part of the number with pound signs, preceded by a base in decimal.
> 
> Eg: 16#6.a7#e+2, 18.9,

Sounds suspiciously like homework, but that's a fun problem.

__CODE__
#!/usr/bin/perl
use strict;
use warnings;

my @numbers = (
  '16#6.f7#e+2',
  '18.9',
  '2#01013#',
  '16e+2',
);
my @valid   = (0 .. 9, 'a' .. 'z');

for my $num (@numbers) {
  my ($base, $n, $exp);
  if ($num =~ /^(\d+)\#([^\#]*?)\#(?:e\+(\d+))?$/x) {
    ($base, $n) = ($1, $2);
    $exp = defined $3 ? $3 : 1;
  } elsif ($num =~ /^(\d[\d._]*?)(?:e\+(\d+))?$/) {
    ($base, $n) = (10, $1);
    $exp = defined $2 ? $2 : 1;
  }
  next if not $n;
  my $invalid = '[^._'.join('',@valid[0..($base-1)]).']';
  warn "invalid base $base number [$n] detected! ($invalid)\n"
    if $n =~ /$invalid/;
  print "got base $base, num $n, exp $exp\n";
}
__END__

That should (more than) get you started!

HTH,
Dave


Re: Regex help

2005-01-16 Thread Ajey Kulkarni

Huh..
Thanks a ton. I never expected a program. :-)).
regards
-Ajey

Re: Regex help

2005-01-16 Thread manfred

That leads me to a question :-)
>  if ($num =~ /^(\d+)\#([^\#]*?)\#(?:e\+(\d+))?$/x) {

What particular use has the _x_ modifier in this example?
I mean the hashes are escaped?
--manfred



--
http://glassdoc.org
http://glassdoc.de

Re: Regex help

2005-01-17 Thread Dave Gray

On Mon, 17 Jan 2005 08:37:07 +0100, manfred <[EMAIL PROTECTED]> wrote:
> 
> That leads me to a question :-)
> 
> >>  if ($num =~ /^(\d+)\#([^\#]*?)\#(?:e\+(\d+))?$/x) {
> 
> What particular use has the _x_ modifier in this example?
> I mean the hashes are escaped?

I forgot to remove the /x when I stripped the comments from the regex
because if this was homework, I wanted the OP to work to understand it
;)
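
For anyone following along, this is roughly what a commented /x version of
that first pattern could look like. It is a reconstruction for illustration
(the comments and the name $based_number are not from the original program):

```perl
use strict;
use warnings;

# With /x, whitespace in the pattern is ignored and "#" starts a comment,
# which is why the literal pound signs have to be escaped.
my $based_number = qr/
    ^ (\d+)              # base, written in decimal
    \# ( [^\#]*? )       # digits between the pound signs
    \#
    (?: e \+ (\d+) )?    # optional exponent
    $
/x;

if ('16#6.f7#e+2' =~ $based_number) {
    print "base $1, digits $2, exponent $3\n";   # base 16, digits 6.f7, exponent 2
}
```

Without the comments and extra whitespace, the /x modifier makes no
difference to what the pattern matches.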


RE: RegEx Help

2001-08-19 Thread Gibbs Tanton - tgibbs


Ok, let me preface this by saying that I know very little about html or web
pages, so what I say might not be entirely correct...but I think I can point
you in the right direction.

First, the part that says
($url = $info) =~ m/.../;
does not do what you want it to (I think).
I think you want to match the regex against $info and then assign the result
of the (.*) to $url.  What yours does is assign $info to $url and then
perform a match against it (the (.*) will be in $1).  To do the former, you
need to type:
($url) = $info =~ m/.../;
m// returns a list of the parenthetical matches, of which you want the
first one (and the only one in this case).
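
A small sketch of the two forms side by side. The sample string and the
variable $copy are made up for the illustration:

```perl
use strict;
use warnings;

my $info = '<a href="http://www.example.com/">Example</a>';   # made-up sample

# Copies $info into $copy, then matches against $copy.
# $copy is left unchanged by the match; the capture is only in $1.
(my $copy = $info) =~ m/href="(.*?)"/;

# Matches against $info in list context and assigns the capture to $url.
my ($url) = $info =~ m/href="(.*?)"/;

print "copy: $copy\n";   # the whole original string
print "url:  $url\n";    # http://www.example.com/
```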

The second problem is that (.*) is going to eat up too much information.  It
will match as far as possible gobbling up quite a few " along the way
until it reaches the very LAST ".  What you want it to do is match the
very FIRST " that it finds.  An easy (and perhaps too simplistic) way of
doing this is to make (.*) match minimally by adding a ? after the *.
(.*?).  This tells .* to match as little as possible and therefore only goes
to the first ".  As I said, this might be too simplistic for what you
want, but it works with the sample data you provided.
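
To see the difference between the two quantifiers, here is a sketch on an
invented tag with two quoted attributes:

```perl
use strict;
use warnings;

my $info = '<a href="page.html" title="Home">Home</a>';   # made-up sample

my ($greedy)  = $info =~ /href="(.*)"/;    # runs on to the LAST double quote
my ($minimal) = $info =~ /href="(.*?)"/;   # stops at the FIRST double quote

print "greedy:  $greedy\n";    # page.html" title="Home
print "minimal: $minimal\n";   # page.html
```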

Finally, if the __DATA__ section has embedded newlines, you will want to add
the /s flag to the regex so that . also matches a newline...I didn't know if
my mail program added the newlines or if they were there already.

So, the final two lines are:
($url) = $info =~ m/a.*href="(.*?)"/s;
($image) = $info =~ m/img.*src="(.*?)"/s;

These lines work with the data you provided.  However, as your application
gets more complex, you might want to consider using the /g flag to match
multiple hrefs and imgs.  If you do that, you will want to do minimal
matching on the first .* as well or else you will gobble up all of your data
the first time.

Good Luck!
Tanton


RE: RegEx Help

2001-08-19 Thread Gibbs Tanton - tgibbs


oh, I just noticed that you do have 2 img statements in your data.  You'll
have to use the /g regex modifier to catch them all.

@images = $info =~ /img.*?src="(.*?)"/sg;

Now, each element of @images will contain one of the parenthetical matches.
So, in your example, scalar(@images) == 2 and $images[0] equals the first
image matched while $images[1] is the second one matched.

Notice the minimal match after the img so that not too much is gobbled up.
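
A self-contained version to try, with invented sample data in place of the
original __DATA__ section (one of the tags is split across lines to show why
/s matters):

```perl
use strict;
use warnings;

# Made-up sample; the second img tag contains a newline.
my $info = qq{<img src="one.gif">\n<p>text</p>\n<img\n   src="two.jpg">\n};

# In list context with /g, the match returns every capture it finds.
my @images = $info =~ /img.*?src="(.*?)"/sg;

print scalar(@images), " images: @images\n";   # 2 images: one.gif two.jpg
```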

Good Luck!
Tanton



Re: Regex help!

2002-07-03 Thread John W. Krahn


Timothy Johnson wrote:
> 
>   m/(^.+\.gif|^.+\.jpg)/
> 
> which would still be wrong, because you can't put anchors and such in a
> regex, but if we move everything except the characters we want to compare
> out of the (|), we get:

Actually, you can put anchors in a regular expression.  Your example
above is a valid regular expression in Perl.


John
-- 
use Perl;
program
fulfillment


Re: Regex help!

2002-07-03 Thread Janek Schleicher


Troy May wrote at Wed, 03 Jul 2002 07:58:10 +0200:

> Hello, I'm trying to match images by the extension.  Either .jpg or .gif
> 
> Here's what I wrote:
> 
> $plup =~ m/\(\^\.\.gif\|\^\.\.jpg)/\
> 
> Is it all screwed up?  I'm not real good with regexes.

$plup =~ /\.(gif|jpg)$/;
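
A quick way to try that pattern on a few names. The sub image_ext and the
file names are invented for this sketch:

```perl
use strict;
use warnings;

# Returns the extension if the name ends in .gif or .jpg, otherwise undef.
sub image_ext {
    my ($name) = @_;
    return $name =~ /\.(gif|jpg)$/ ? $1 : undef;
}

for my $name ('photo.gif', 'logo.jpg', 'notes.txt', 'jpg.bak') {
    my $ext = image_ext($name);
    print defined $ext ? "$name is an image ($ext)\n" : "$name does not match\n";
}
```

The match is case-sensitive as written; adding the /i modifier would also
accept names such as PHOTO.GIF, if that is wanted.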


Greetings,
Janek


Re: Regex Help

2002-07-08 Thread Peter Scott


At 09:37 AM 7/8/02 -0500, Daryl J. Hoyt wrote:
>Hi,
> I am trying to write a script to kill all processes named $string.

Does your system not have killall?

NAME
       killall - kill processes by name

SYNOPSIS
       killall [-e,--exact] [-g,--process-group] [-i,--interactive]
       [-q,--quiet] [-v,--verbose] [-w,--wait] [-V,--version]
       [-s,--signal signal] [--] name ...
       killall -l
       killall -V,--version

DESCRIPTION
       killall sends a signal to all processes running any of the
       specified commands.  If no signal name is specified,
       SIGTERM is sent.
[...]

>I am
>not sure how to handle the regex the right way.  Here is a sample of the
>command:
>
>djh  17893 17892  0 Jul03 pts/16   00:00:00 rlogin test3
>djh   6401 25628  0 Jul05 ?        00:00:00 [rhn-applet ]
>djh   6525 25746  0 Jul05 pts/1    00:00:00 bash
>djh   6530  6525  0 Jul05 pts/1    00:00:00 rlogin xeon
>djh   6531  6530  0 Jul05 pts/1    00:00:00 rlogin xeon
>djh   6828 25746  0 Jul05 pts/9    00:00:00 bash
>djh   6833  6828  0 Jul05 pts/9    00:00:00 rlogin xeon
>djh   6834  6833  0 Jul05 pts/9    00:00:11 rlogin xeon
>djh   7292 26480  0 Jul05 pts/10   00:00:00 rlogin spot
>djh   7293  7292  0 Jul05 pts/10   00:00:00 rlogin spot
>djh   7361 25746  0 Jul05 pts/12   00:00:00 bash
>djh   7366  7361  0 Jul05 pts/12   00:00:00 rlogin spot
>djh   7367  7366  0 Jul05 pts/12   00:00:00 rlogin spot
>djh   7466 25746  0 Jul05 pts/14   00:00:00 bash
>djh   1612 25628  0 08:49 ?        00:00:00 [rhn-applet ]
>djh   1639 25746  0 08:50 pts/17   00:00:00 bash
>djh   1647  1639  0 08:51 pts/17   00:00:00 rlogin duke
>djh   1648  1647  0 08:51 pts/17   00:00:00 rlogin duke
>djh   1742     1  0 09:10 ?        00:00:00 gvim process_killer.pl
>djh   1778  7466  0 09:20 pts/14   00:00:00 /usr/local/bin/perl ../process_ki
>djh   1779  1778  0 09:20 pts/14   00:00:00 sh -c ps -ef | grep djh
>djh   1780  1779  0 09:20 pts/14   00:00:00 ps -ef
>djh   1781  1779  0 09:20 pts/14   00:00:00 grep djh
>
>I want to strip out the first column of numbers (the PIDs).  How would I do
>this if each line of this is an element in an array?  Any help would be
>appreciated.

Just parse it a line at a time.  Split the line on white space; the pid 
is the second field.  The command may contain white space but nothing 
before it can, so get the command by joining on space everything from 
the first column of the command thru the last column.

for (`$PS`) {
   my @cols = split;
   my $pid = $cols[1];
   my $command = join ' ' => @cols[7 .. $#cols];
   kill TERM => $pid if $command eq $string;   ###
}

Might as well make it so you can use a regex in $string.  Say that if 
$string is enclosed in /.../, it means it's a regex.  Replace the ### line with

   my $regex = $string;
   # s{...}{$1} strips the enclosing slashes and keeps the pattern itself
   kill TERM => $pid if ($regex =~ s{^/(.+)/$}{$1}) ? $command =~ /$regex/
                                                    : $command eq $string;

Um, maybe that's a bit abbreviated for a beginners' list :-)  Alternatively:

   my $killit;
   my $regex = $string;
   # s{...}{$1} strips the enclosing slashes and keeps the pattern itself
   if ($regex =~ s{^/(.+)/$}{$1}) {
     $killit = $command =~ /$regex/;
   }
   else {
     $killit = $command eq $string;
   }
   kill TERM => $pid if $killit;

I suppose there might be some attraction to killing all the processes 
at once, in which case instead of killing them, you add the $pid to an 
array @pids and once done with the loop do 'kill TERM => @pids if @pids'.
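
A minimal sketch of that last variant, collecting the PIDs first and signalling once. It assumes the "ps -ef" column layout shown above (command starting at the eighth column); the sub name matching_pids is made up for illustration, and the kill itself is left commented out so the sketch has no side effects.

```perl
use strict;
use warnings;

# Return the PIDs of all "ps -ef" lines whose command equals $string,
# or matches it as a regex when $string is written as /.../.
# matching_pids() is a hypothetical helper, not part of the original post.
sub matching_pids {
    my ($string, @lines) = @_;
    my $regex = $string =~ m{^/(.+)/\z} ? qr/$1/ : undef;
    my @pids;
    for my $line (@lines) {
        my @cols = split ' ', $line;
        # skip the header line and anything too short to carry a command
        next unless @cols > 7 && $cols[1] =~ /^\d+$/;
        my $command = join ' ', @cols[7 .. $#cols];
        my $hit = defined $regex ? $command =~ $regex : $command eq $string;
        push @pids, $cols[1] if $hit;
    }
    return @pids;
}

# Typical use:
# my @pids = matching_pids('rlogin xeon', `ps -ef`);
# kill TERM => @pids if @pids;
```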
--
Peter Scott
Pacific Systems Design Technologies
http://www.perldebugged.com/



RE: Regex help!!

2002-12-01 Thread Beau E. Cox

Hi -

Could you please explain what you get?

-Original Message-
From: Johnstone, Colin [mailto:[EMAIL PROTECTED]]
Sent: Sunday, December 01, 2002 8:06 PM
To: '[EMAIL PROTECTED]'
Subject: FW: Regex help!!

Further to my previous message.

after fixing the ~= I get the following.

- the following what? ---

I know $screenOutput has a value because I see the html contained in the
file I'm reading on the screen; it's just not doing the substitution.

please advise.

Colin

-Original Message-
From: Johnstone, Colin
Sent: Monday, December 02, 2002 17:00
To: '[EMAIL PROTECTED]'
Subject: Regex help!!

Hi all,

I'm reading in a file line by line and I want to look for the occurrence of
this string
 in $line. Is this the right way to do it? If it finds
it, it is to insert the value of $screenOutput in its place.

if($screenOutput ne ""){
my $responsePage = "";
open(IN, "<$locationResponsePageSkin") or die("Cannot open: $!");
while( my $line = <IN> ){
$line ~= s//$screenOutput/;
$responsePage .= $line;
}
close(IN);
print $responsePage;
}

Colin Johnstone
Website Project Officer
Corporate Website Unit
Public Affairs Directorate
ph 9561 8643


RE: Regex help!!

2002-12-01 Thread Johnstone, Colin

Thanks Danny

I must be doing something else wrong. I bet it's variable scope; I'm just getting my
head around it.


Colin

-Original Message-
From: Danny Miller [mailto:[EMAIL PROTECTED]]
Sent: Monday, December 02, 2002 17:13
To: Johnstone, Colin
Subject: RE: Regex help!!


Your regex is right, the substitution works...on my system at least.

Regards,

Danny


Re: Regex help!!

2002-12-02 Thread Rob Dixon

Colin

Removing some of the trees so that the wood is visible:

if ($screenOutput)
{
open (IN, "< $locationResponsePageSkin") or die("Cannot open: $!");
while(<IN>)
{
s//$screenOutput/;
print;
}
close IN;
}

HTH,

Rob


Re: regex help

2002-12-13 Thread Mystik Gotan

$var =~ s/,/|/g;   # note: this also changes the commas inside the double quotes



--
Bob Erinkveld (Webmaster Insane Hosts)
www.insane-hosts.net
MSN: [EMAIL PROTECTED]






From: Danny Grzenda <[EMAIL PROTECTED]>
To: "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>
Subject: regex help
Date: Thu, 12 Dec 2002 12:52:59 -0600


DB4515C,625.25,378,327,382,352,805,163,513.5,699,257.88,,,"4,503","1,801",805

Trying to create a regex to substitute commas with pipes, except for the
commas between the double quotes.

can't get one to work.

thanks















Custom Technology Solutions
http://www.ctssys.com

Daniel G. Grzenda
Application Programmer/Intranet Developer
262 N. Sam Houston Parkway E.
Suite 200
Houston, Texas 77060
281-765-6283
Fax 281-765-6273
[EMAIL PROTECTED]







Re: regex help

2002-12-13 Thread Adriano Rodrigues Ferreira


>
>DB4515C,625.25,378,327,382,352,805,163,513.5,699,257.88,,,"4,503","1,801",805
>
>Trying to create a regex to substitute comas with pipes, except for the
>commas between the double quotes.

I don't think a single simple regex performs such a miracle. Maybe this has to do with the
fact that classic regular expressions describe regular languages (in language theory terms),
while this kind of problem seems to demand knowledge of context (whether a comma sits
inside or outside quotes, which is what gives it a different meaning). But I am not
absolutely sure about this, and besides, Perl regexes are full of extensions which probably
make them able to recognize more complex languages. But this is just philosophical speculation.

I tried to do something like this a while ago,
struggling with some old programs which had not been properly built.

A plain split won't do it either, because it cannot tell whether it is inside or outside quotes,
and split relies on regexes. But maybe you can split with a special algorithm
and then do a join. For example, the following works.

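# this splits a delimited string of unquoted fields and quoted fields
# delimiter is ',' and quotes are '"'
sub specialsplit {
    my ($string) = @_;
    my @array = ();
    # peel one field plus its comma off the front; the ^ anchor and [^",]
    # keep a comma inside quotes from being taken as a delimiter
    while ($string =~ s/^("[^"]*"|[^",]*),//) {
        push @array, $1;
    }
    push @array, $string;   # whatever is left is the last field
    return @array;
}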

And then with

my $sample = 
q{DB4515C,625.25,378,327,382,352,805,163,513.5,699,257.88,,,"4,503","1,801",805};

print join "|", specialsplit($sample);

you get what you asked for.

You'll have problems if you are to encounter quotes inside quotes.

(This is just an ugly solution. A more proficient Perl programmer would probably solve
it with more elegance.)

Hope this helps.

Adriano.





Re: regex help

2002-12-13 Thread Jenda Krynicky

From: Adriano Rodrigues Ferreira <[EMAIL PROTECTED]>
> >DB4515C,625.25,378,327,382,352,805,163,513.5,699,257.88,,,"4,503","1,
> >801",805
> >
> >Trying to create a regex to substitute comas with pipes, except for
> >the commas between the double quotes.
> 
> I think there is no regex which makes such a miracle. Maybe this has
> to do with the fact that regex are related to context-free languages
> (in language theory terms) and this kind of problem demands
> context-dependency (this is what gives comma inside and outside quotes
> different meanings). But I am not absolutely right about this and
> besides, Perl regexes are full with extensions which probably make it
> able to recognize more complex languages. But this is just
> philosophical speculation.

Yeah Perl regexps are pretty unreg.

I think you want something like this:

s/("[^"]*"|[^,]+)|,/defined $1 ? $1 : '|'/ge;
or
s/("[^"]*"|.)/$1 eq ',' ? '|' : $1/ge;

This assumes that to escape a doublequote inside doublequotes you 
double it:

"this ""thing"" is just one field!","an this is the second"

If you'd wanted to support 
"this \"thing, thong\" is just one field!","an this is the second"
you'd have to use this:

s/("(?:[^"\\]|\\.)*"|[^,]+)|,/defined $1 ? $1 : '|'/ge;
or
s/("(?:[^"\\]|\\.)*"|.)/$1 eq ',' ? '|' : $1/ge;
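
As a minimal sketch, the first substitution above can be wrapped in a sub and tried on a shortened version of the sample line. The name commas_to_pipes and the shortened data are made up for illustration.

```perl
use strict;
use warnings;

# Replace commas with pipes except inside double-quoted fields.
# A quoted field (or a run of non-comma characters) is captured in $1 and
# put back unchanged; a bare comma leaves $1 undefined and becomes a pipe.
sub commas_to_pipes {
    my ($line) = @_;
    $line =~ s/("[^"]*"|[^,]+)|,/defined $1 ? $1 : '|'/ge;
    return $line;
}

print commas_to_pipes('DB4515C,625.25,,,"4,503","1,801",805'), "\n";
# prints: DB4515C|625.25|||"4,503"|"1,801"|805
```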

HTH, Jenda
= [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed 
to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery



Re: regex help

2002-12-13 Thread John W. Krahn

Danny Grzenda wrote:
> 
> DB4515C,625.25,378,327,382,352,805,163,513.5,699,257.88,,,"4,503","1,801",805
> 
> Trying to create a regex to substitute commas with pipes, except for the
> commas between the double quotes.
> 
> can't get one to work.

You have to parse the data.  Here is some code modified from the perlop manpage:

$ perl -e'
$_ = qq[DB4515C,625.25,378,327,382,352,805,163,513.5,699,257.88,,,"4,503","1,801",805\n];
print;
LOOP: {
redo LOOP if /\G"[^"]*"/gc; # bypass quoted strings
redo LOOP if s/\G,/|/gc;    # change commas to pipes
redo LOOP if /\G[^,]/gc;    # ignore other characters
}
print;
'
DB4515C,625.25,378,327,382,352,805,163,513.5,699,257.88,,,"4,503","1,801",805
DB4515C|625.25|378|327|382|352|805|163|513.5|699|257.88|||"4,503"|"1,801"|805



John
-- 
use Perl;
program
fulfillment


RE: regex help

2002-12-16 Thread Danny Grzenda

Thanks everyone for the help. It worked like a charm. Not actually sure how
it works. Never used \G before. My regex book says it's not used very much.

Thanks again.
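
For anyone else wondering about \G: it anchors a match at pos(), the spot where the previous /g match on that string ended; /g moves pos() forward on success and /c keeps it in place on failure. The following is only a sketch of that idea, not the code from the reply. It builds a new string instead of editing in place, and the sub name pipe_delimit is made up for illustration.

```perl
use strict;
use warnings;

# Walk the string once from left to right. Each branch is pinned by \G to
# the current position, so exactly one of them consumes the next piece.
sub pipe_delimit {
    my ($in) = @_;
    my $out = '';
    for ($in) {                       # alias $_ to $in for the matches below
        pos($_) = 0;
        while (pos($_) < length $_) {
            if    (/\G("[^"]*")/gc) { $out .= $1 }    # quoted field, copied as is
            elsif (/\G,/gc)         { $out .= '|' }   # separator becomes a pipe
            elsif (/\G(.)/gcs)      { $out .= $1 }    # any other single character
        }
    }
    return $out;
}

print pipe_delimit('699,257.88,,,"4,503","1,801",805'), "\n";
```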


Re: regex help

2004-05-28 Thread Roberto Etcheverry



On Fri, 28 May 2004, Andrew Gaffney wrote:

> I'm trying to write a regex to parse the following data. Each group is a string
> to parse.
>
> 05/28/04
> 
> Purchase With Pin Pin
> 
> $10.00(pending) href='javascript: ShowHelp("PENDING TRANSACTION")'> border="0">
> $1,224.45
>
> 05/27/04
> 
> Purchase With Pin Shell Service Stlake
> St. Loumo
> 
> $1.78
> $1,234.45
>
> 05/21/04
> 
> Atm Withdrawal One O'fallon Squo'fallon
> Mo 1
> 
> $20.00
> $
> 1,134.79
>
> This is the regex I put together:
>
>  my $regex = ']+?>(\d{2})/(\d{2})/(\d{2}).+?';
>  $regex   .= ']+?>(.*?).+?';
>  $regex   .= ']+?>(.+?).+?';
>  $regex   .= ']+?>(?:\$(\d+\.\d{2})).*?.+?';
>  $regex   .= ']+?>(?:\$(\d+\.\d{2})).*?.+?';
>  $regex   .= ']+?>.*?(?:\$(\d+\.\d{2})).*?';
>
> The first field will always be in the form 'mm/dd/yy'. The second and third
> field need to be captured as they are. As for the fourth and fifth fields, only
> one will contain a value. The other one will be empty (nothing between
> ). The format is '$123.45' with the possibility of trailing HTML before
> the . I only want the number without the $. The sixth field will contain a
> dollar amount like the fourth and fifth fields. It could be surrounded by HTML.
> Again, I only need the number without the $. What is wrong with the above regex?
> I am using it with the 's' modifier.
>
> --
> Andrew Gaffney
> Network Administrator
> Skyline Aeronautics, LLC.
> 636-357-1548
>
>
It seems two things are missing:

1) A '?' after the 4th and  5th group (because they may be empty).
2) Include ',' on the regex matching the amounts (to match '1,234.45' for
example).

So the regex would be:

my $regex = ']+?>(\d{2})/(\d{2})/(\d{2}).+?';
$regex   .= ']+?>(.*?).+?';
$regex   .= ']+?>(.+?).+?';
$regex   .= ']+?>(?:\$([\d,]+\.\d{2}))?.*?.+?';
$regex   .= ']+?>(?:\$([\d,]+\.\d{2}))?.*?.+?';
$regex   .= ']+?>.*?(?:\$([\d,]+\.\d{2})).*?';
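
The two changes can be seen in isolation in the sketch below. It works on the text of a single cell rather than on the full HTML row, and the sub name amount and the sample cell strings are made up for illustration.

```perl
use strict;
use warnings;

# Capture a dollar amount without the '$'. [\d,]+ allows thousands
# separators, and the trailing ? on the group lets an empty cell match
# with the capture left undefined instead of failing the whole regex.
sub amount {
    my ($cell) = @_;
    my ($num) = $cell =~ /^(?:\$([\d,]+\.\d{2}))?/;
    return $num;
}

for my $cell ('$1,224.45', '$10.00(pending)', '') {
    my $num = amount($cell);
    print defined $num ? "amount: $num\n" : "empty cell\n";
}
```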



Re: regex help

2004-05-28 Thread Andrew Gaffney

Ah, thank you. Those changes worked.
--
Andrew Gaffney
Network Administrator
Skyline Aeronautics, LLC.
636-357-1548

Re: Regex Help

2003-08-01 Thread ss004b3324

Hi,

I have a number of strings, in the following format:
A deep resonant "ooh-hu" with emphasis on the
first syllable.
from which I wish to extract the following part of the string using a regex:
ooh-hu
I have been trying to no avail - and would very much appreciate some help
with this problem.

TIA,

Shaun

Re: regex. help

2003-09-24 Thread Damon Davison

On Wednesday 24 September 2003 10:54, Pandey Rajeev-A19514 wrote 
:  I have a question regarding matching strings that have
: embeded perl special characters.
...
:  Can some one tell me an easy way to do ?

Have a look at \Q and \E.  They're both listed in perldoc perlre 
under the main "Regular Expressions" heading.

m/\Q.*\E./

matches the literal string .*, followed by any character.

Cheers, 

Damon

-- 

Damon Allen Davison
[EMAIL PROTECTED]

"A UNIX life is hard."



Re: regex. help

2003-09-24 Thread Sudarshan Raghavan

Pandey Rajeev-A19514 wrote:

> Hi ,
>
> I have a question regarding matching strings that have embedded perl special characters.
>
> $abc = 'MC10.G(12)3c';
>
> $efg = 'MC10.G(12)3c';
>
> Now I want to check whether they are same or not by doing
>
> if ($abc =~ /$efg/) { do something;}

Why not use the 'eq' operator
if ($abc eq $efg) { } # do something

> For this I need to insert a '\' before every special char.

For academic interest, you can escape the special characters during
matching using \Q and \E
if ($abc =~ /\Q$efg\E/) { } # do something

> Can some one tell me an easy way to do ?
>
> Regards
> Rajeev
 




RE: regex. help

2003-09-24 Thread Thomas Bätzler

Hi,

Pandey Rajeev-A19514 [mailto:[EMAIL PROTECTED] asked:
> I have a question regarding matching strings that have 
> embeded perl special characters.
> 
> $abc = 'MC10.G(12)3c';
> $efg = 'MC10.G(12)3c';
> 
> Now I want to check whether they are same or not by doing
> if ($abc =~ /$efg/) { do something;}

Why not use $abc eq $efg if you only want to check for
equality?

> For this I need to insert a '\' before every special char. 
> Can some one tell me an easy way to do ?

Note that the qr operator on its own does not quote metacharacters;
it only compiles a pattern. To match the contents of $efg literally,
combine it with \Q...\E (or quotemeta):

my $re = qr/\Q$efg\E/;

if( $abc =~ /$re/ ) {...}

See the section "Regexp Quote-Like Operators" in the perlop
manpage for details on qr//.
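
To see why the escaping matters for this particular string, here is a small sketch comparing the three tests. The variable names $as_pattern, $as_literal and $as_string are made up for illustration.

```perl
use strict;
use warnings;

my $abc = 'MC10.G(12)3c';
my $efg = 'MC10.G(12)3c';

# Interpolated as a pattern, the dot is a wildcard and the parentheses
# only group, so the pattern looks for "MC10", any character, "G123c".
my $as_pattern = $abc =~ /$efg/     ? 1 : 0;
# \Q...\E escapes every metacharacter before the match is attempted.
my $as_literal = $abc =~ /\Q$efg\E/ ? 1 : 0;
# Plain string equality needs no regex at all.
my $as_string  = $abc eq $efg       ? 1 : 0;

print "pattern=$as_pattern literal=$as_literal eq=$as_string\n";
# prints: pattern=0 literal=1 eq=1
```

Keep in mind that /\Q$efg\E/ without ^ and $ anchors tests for a substring, so eq remains the right tool for a plain equality check.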

HTH,
Thomas



RE: Regex Help

2002-03-12 Thread Hanson, Robert


Maybe something like this:

@line = split /:/, $theLineOfData;
s/^'(.*)'$/$1/ for @line; # removes the ticks (s/// edits each element in place)

And to match the whole word:

if ( $field =~ /\bBRANCH\b/ ) {
# matches word boundary
}

Or you could remove the whitespace as well to simplify things...

@line = split /:/, $theLineOfData;
s/^'\s*(.*?)\s*'$/$1/ for @line; # removes the ticks and the padding

if ( $line[0] eq 'BRANCH' ) {
# do stuff
}
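
Put together as a runnable sketch, using one record from the data below. The sub name parse_record is made up for illustration.

```perl
use strict;
use warnings;

# Split one record on ':' and strip the surrounding ticks and any padding
# inside them. s/// in a "for" modifier edits each element in place.
sub parse_record {
    my ($record) = @_;
    my @fields = split /:/, $record;
    s/^'\s*(.*?)\s*'$/$1/ for @fields;
    return @fields;
}

my @rec = parse_record(q{'BRANCH_SSC  ':'Kevin':'Oaks'});
print join('|', @rec), "\n";               # prints: BRANCH_SSC|Kevin|Oaks
print "plain BRANCH record\n" if $rec[0] eq 'BRANCH';   # not printed here
```

Once the padding is gone, eq compares the whole field, so BRANCH no longer matches BRANCH_SSC or BRANCHADMIN.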

Rob

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 12, 2002 10:41 AM
To: [EMAIL PROTECTED]
Subject: Regex Help


Here is snippet of data:

'BRANCH  ':'Kurt':'Strothenke'
'BRANCH  ':'Michael':'Mulligan'
'BRANCH_SSC  ':'Kevin':'Oaks'
'BRANCH_SSC  ':'Thomas':'Grove'
'BRANCH_SSC  ':'Stephen':'Orban'
'BRANCH_SSC  ':'Gerald':'Parnell'
'BRANCH_SSC  ':'Liane':'Mcintyre'
'BRANCHADMIN ':'Ann':'White'
'BRANCHADMIN ':'Brent':'Uhl'

2 problems:

1.  I want to remove the tickmarks from fields 1,2,3 and 
put the data into an array using the : as a separator.  
I am having trouble getting rid of the tick marks, 
everything else works fine.

2.  I want to match whole words only.  I don't want to 
BRANCH to match on BRANCH_SSC or BRANCHADMIN.  My code 
currently matches BRANCH with all three. 

Thanks for your help


RE: Regex Help

2002-03-12 Thread Russ Foster


or you can remove the tick marks with the code...

$line =~ s/'//g ;


-rjf



Re: regex help

2002-04-02 Thread Me


> i have a date string formatted this way:  $date = "Apr. 02, 2002"
> i want to capture the "Apr" (without the period) into $1, "02" into $2, and
> "2002" into $3.
> note that some months may not be abbreviated (such as May), and therefore
> might not have a period.
> my regex skills are still sadly inferior (though i'm learning =)

Hmm. What doc are you learning from? What have you tried
for the above?

> my second (and optional) problem for you to solve: i want to transform this
> into mysql date format (yyyy-mm-dd). this is easy enough to do, but i was
> wondering if there are any modules there that can make date manipulation a lot
> easier. if you know any, just let me know.

Perl's got a huge community supported module library
called CPAN. For more info, enter at a command prompt:

perldoc -q cpan

As this doc will tell you, there's a search.cpan.org.
With it you can easily browse/search the entire library.

hth

--
ralph



RE: regex help

2002-04-02 Thread Jason Larson


> -Original Message-
> From: Gabby Dizon [mailto:[EMAIL PROTECTED]]
> Subject: regex help
> 
> hello list friends,

Hello.
> 
> hope you can help me with my problem. here it is:
> 
> i have a date string formatted this way:  $date = "Apr. 02, 2002"
> i want to capture the "Apr" (without the period) into $1, 
> "02" into $2, and "2002" into $3.
> note that some months may not be abbreviated (such as May), 
> and therefore might not have a period.
> my regex skills are still sadly inferior (though i'm learning =)

  $date =~ /(\w+)\.? (\d+), (\d+)/;

Bear in mind that this solution assumes that the format is always the same
(meaning exactly one space between each value you want to store).  Also be
aware that $1, $2, $3, etc. are read-only variables, so it would (probably)
be better to assign the values to another variable right away.

  my ($Month, $Date, $Year) = $date =~ /(\w+)\.? (\d+), (\d+)/;

> 
> my second (and optional) problem for you to solve: i want to 
> transform this
> into mysql date format (yyyy-mm-dd). this is easy enough to 
> do, but i was
> wondering if there are any modules there that can make date 
> manipulation a lot
> easier. if you know any, just let me know.
> 
> thanks a lot!
> 
> Gabby Dizon
> Web Developer
> Inq7 Interactive, Inc.

Your best option is to check CPAN for date manipulation modules.
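
If you would rather not pull in a module for this one conversion, a hand-rolled version is short. This is only a sketch, assuming English month names whose first three letters identify the month; the sub name mysql_date and the %month_number table are made up for illustration.

```perl
use strict;
use warnings;

# Map the first three letters of a month name to its number.
my %month_number;
@month_number{qw(jan feb mar apr may jun jul aug sep oct nov dec)} = 1 .. 12;

# Convert "Apr. 02, 2002" or "May 5, 2002" into "2002-04-02" style.
# Returns undef when the string does not look like such a date.
sub mysql_date {
    my ($date) = @_;
    my ($mon, $day, $year) =
        $date =~ /^\s*([A-Za-z]+)\.?\s+(\d{1,2}),\s*(\d{4})\s*$/
        or return;
    my $m = $month_number{ lc(substr($mon, 0, 3)) } or return;
    return sprintf '%04d-%02d-%02d', $year, $m, $day;
}

print mysql_date('Apr. 02, 2002'), "\n";   # prints: 2002-04-02
```

The sketch does not check that the day is valid for the month; a date module from CPAN would cover that.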

Hope this helps...
Jason



RE: Regex help!

2002-07-02 Thread David . Wagner


if (  $plup =~ /\.(gif|jpg)$/i ) {
   #
   #  This is success (ie, name ends with a . and either gif or jpg).
   #  Ignore case.
   #
 }else {
   # not a gif or jpg
 }
Wags ;)
-Original Message-
From: Troy May [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, July 02, 2002 22:58
To: Perl Beginners
Subject: Regex help!


Please...  :)

Hello, I'm trying to match images by the extension.  Either .jpg or .gif

Here's what I wrote:

$plup =~ m/\(\^\.\.gif\|\^\.\.jpg)/\

Is it all screwed up?  I'm not real good with regexes.

Thanks if you can help me!

Troy


RE: Regex help!

2002-07-02 Thread Toby Stuart


#!perl.exe -w

use strict;

my $filename = 'image.gif';
#my $filename = 'image.jpg';

if ($filename =~ m/^.*\.(jpg|gif)$/)
{
print "Got one!";
}

---

hth

toby


RE: Regex help!

2002-07-02 Thread Timothy Johnson


On top of the answers given, let me critique your regex a bit.  

First off, be aware that the m at the beginning of your regular expression
is implied, and can be left off.  It's okay to put it there if it is easier
for you to read, but don't be thrown off if people leave it out.

Secondly, let's look at your regex.  It basically says...

m/\(\^\.\.gif\|\^\.\.jpg)/\

m/      Match
\(      one '('
\^      followed by '^'
\.\.    followed by '..'
gif     followed by 'gif'
\|      followed by '|'
\^      followed by '^'
\.\.    followed by '..'
jpg     followed by 'jpg'
)       (a right parenthesis with no matching left)
/       End Match
\       (an extra '\')

The main thing that I see here is that you seem to be escaping almost every
character.  Remember:  characters immediately following a backslash lose
their special meaning in a regex.  The above regex would sort of match the
following string:  '(^..gif|^..jpg'.  But what I think you were trying to do
(correct me if I'm wrong) was something more like this:

  m/(^.+\.gif|^.+\.jpg)/

which would still be wrong, because you can't put anchors and such in a
regex, but if we move everything except the characters we want to compare
out of the (|), we get:

  m/^.+\.(gif|jpg)$/i

which means roughly this:

m/          Beginning of match
^           Starting at the beginning of the string
.+          Match one or more of any character
\.          Followed by a '.'
(gif|jpg)   Followed by either 'gif' or 'jpg'
$           Followed by the end of the string
/           End of match
i           Match upper or lower case letters

One final note:  I chose .+ instead of .* (match zero or more of any
character) because legitimate picture files should have a filename and an
extension, but it's up to you whether you want to match a file simply called
'.gif' or '.jpg'.  It could happen.
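
Here is the final pattern in a small runnable sketch, including the '.gif' edge case just mentioned. The sub name image_ext and the file names are made up for illustration.

```perl
use strict;
use warnings;

# Return the lower-cased extension for gif/jpg file names, undef otherwise.
# .+ requires at least one character before the dot, so a bare ".gif"
# is rejected; change it to .* to accept such names.
sub image_ext {
    my ($name) = @_;
    return $name =~ m/^.+\.(gif|jpg)$/i ? lc($1) : undef;
}

for my $name (qw(Photo.JPG logo.gif .gif notes.txt)) {
    my $ext = image_ext($name);
    print "$name: ", (defined $ext ? $ext : 'no match'), "\n";
}
```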



Re: Regex help

2001-08-22 Thread Jeff 'japhy/Marillion' Pinyan


On Aug 22, Michel Blanc said:

>Could anyone help in matching one of the following letters
>
>  cCdeEfGhiI
>
>any number of times, but if a letter has already matched, it can repeat
>again in the string.
>
>cCdE   : match
>cCcE   : doesn't match

I think you mean "if a letter has already matched, it CAN'T repeat again
in the string" -- meaning, each character must be unique.

It's not a simple task without a complex regex assertion, (??{ ... }).

I can provide a solution, but I cannot guarantee it will be easy to
understand once explained.

-- 
Jeff "japhy" Pinyan  [EMAIL PROTECTED]  http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Regex help

2001-08-22 Thread Michel Blanc


Jeff 'japhy/Marillion' Pinyan a écrit :
> 

> I think you mean "if a letter has already matched, it CAN'T repeat again
> in the string" -- meaning, each character must be unique.

Yes, that's right.

> It's not a simple task without a complex regex assertion, (??{ ... }).
> 
> I can provide a solution, but I cannot guarantee it will be easy to
> understand once explained.

I have already spent a lot of energy on your JAPHs :)
Thank you for spending time on this.
When you send me your solution, I'll come back and tell you whether I
understand what you did!

Michel.

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: Regex help

2001-08-22 Thread Bob Showalter


> -Original Message-
> From: Michel Blanc [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, August 22, 2001 10:58 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: Regex help
> 
> 
> Jeff 'japhy/Marillion' Pinyan a écrit :
> > 
> 
> > I think you mean "if a letter has already matched, it CAN'T 
> repeat again
> > in the string" -- meaning, each character must be unique.
> 
> Yes, that's right.
> 
> > It's not a simple task without a complex regex assertion, 
> (??{ ... }).
> > 
> > I can provide a solution, but I cannot guarantee it will be easy to
> > understand once explained.
> 
> I already spend a lot of energy on your JAPHs :)
> Thank you for spending time on this.
> When you'll send me your solution, I'll come back and tell you if I
> understand what you did !

If you want to match any of those characters (and no other) in any order, 
but at most once, here is a non-regex approach (not terribly efficient
if you need to do it millions of times, but it works):

 use strict;

 my $key = 'cCdeEfGhiI'; # legal chars

 check('cCdE');  # match
 check('cCcE');  # no match
 check('xccE');  # no match

 sub check
 {
     my $val = shift;

     print "Testing $val: ";
     my %h = map { ($_ => 0) } split //, $key;
     for (split //, $val)
     {
         print("No match\n"), return
             unless exists $h{$_} && !$h{$_}++;
     }
     print "Match\n";
 }

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: Regex help

2001-08-22 Thread Gibbs Tanton - tgibbs


 So in essence, you don't want to match if a character repeats

sub check {
  return 1 if $_[0] !~ /(.).*\1/;
  return 0;
}

print check ('cCdE'), "\n"; # prints 1
print check ('cCcE'), "\n"; # prints 0
print check ('cEdE'), "\n"; # prints 0

Good luck
Tanton

-Original Message-
From: Michel Blanc
To: [EMAIL PROTECTED]
Sent: 8/22/2001 12:12 AM
Subject: Regex help

Dear All,

Does anyone could help in mathing one of the following letters

cCdeEfGhiI

any number of times, but if a letter has already matched, it can repeat
again in the string.

For instance :

cCdE: match
cCcE: doesn't match

Any clue anyone ?

Thanks,

Michel

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: Regex help

2001-08-22 Thread Jeff 'japhy/Marillion' Pinyan


On Aug 22, Michel Blanc said:

>Jeff 'japhy/Marillion' Pinyan a écrit :
>> 
>
>> I think you mean "if a letter has already matched, it CAN'T repeat again
>> in the string" -- meaning, each character must be unique.
>
>Yes, that's right.
>
>> It's not a simple task without a complex regex assertion, (??{ ... }).
>> 
>> I can provide a solution, but I cannot guarantee it will be easy to
>> understand once explained.
>
>I already spend a lot of energy on your JAPHs :)
>Thank you for spending time on this.
>When you'll send me your solution, I'll come back and tell you if I
>understand what you did !

Someone else has already shown the general approach for ensuring a
unique-character string:

  sub unique_characters {
    $_[0] =~ /(.).*?\1/s ? 0 : 1
  }

We can extrapolate upon this idea and test afterward that the string
doesn't contain any unwanted characters:

  sub unique_char_set {
    my ($str, $chars) = @_;
    return 0 if $str =~ /[^\Q$chars\E]/;
    return !($str =~ /(.).*?\1/s);
  }

This returns false if there is an invalid character, or if there is a
doubled character.  Otherwise, it returns true.

You can do this using the (??{ ... }) assertion, like I suggested.  It is
more complex, and I'm not sure how it compares speed-wise.  And, when I
think of it, it is totally improper.
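
As a quick check of how unique_char_set behaves, here is a small usage
sketch. The function is repeated so the snippet runs on its own; the test
strings are taken from the thread, except 'xcE', which is made up to show the
invalid-character case.

```perl
use strict;
use warnings;

# The function from this message, repeated so the snippet is self-contained.
sub unique_char_set {
    my ($str, $chars) = @_;
    return 0 if $str =~ /[^\Q$chars\E]/;   # a character outside the set
    return !($str =~ /(.).*?\1/s);         # a character that repeats
}

my $allowed = 'cCdeEfGhiI';
for my $str (qw(cCdE cCcE xcE)) {
    print "$str: ", (unique_char_set($str, $allowed) ? 'ok' : 'bad'), "\n";
}
# prints:
#   cCdE: ok
#   cCcE: bad
#   xcE: bad
```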

-- 
Jeff "japhy" Pinyan  [EMAIL PROTECTED]  http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **


--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Regex help

2001-08-23 Thread Michel Blanc


Jeff 'japhy/Marillion' Pinyan a écrit :

> Someone else has already shown the general approach for ensuring a
> unique-character string:
> 
>   sub unique_characters {
> $_[0] =~ /(.).*?\1/s ? 0 : 1
>   }
> 
> We can extrapolate upon this idea and test afterward that the string
> doesn't contain any unwanted characters:
> 
>   sub unique_char_set {
> my ($str, $chars) = @_;
> return 0 if $str =~ /[^\Q$chars\E]/;
> return !($str =~ /(.).*?\1/s);
>   }

Thanks for your response guys. This is very useful.

In fact, since I needed a one liner (I forgot to say that) for a
regex-based dispatch table, I tried to convert that to :


($str !~ /(c|C|d|e|E|f|G|h|i|I|k|K|l|L|m|M|r|R|s|t|T|v|V|x|X).*?\1/s);

But this doesn't fail if unwanted characters are in.

So I am afraid that we'll arrive at (??{ ... }) 
What I am trying to do is some kind of shell. Command dispatching is
done via a hash containing "regex" => coderef style data. That's why I
am looking for a one--liner.

Thanks ! 
Michel.
-- 
Michel Blanc
Centre Multimédia Erasme/Parc d'activités innovantes
69930 Saint Clément-les-places
Tel : +33-(0)4-74-70-68-40 / Fax : +33-(0)4-74-70-68-40

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Regex help

2001-08-23 Thread Michel Blanc


Bob Showalter a écrit :
> 

> If you want to match any of those characters (and no other) in any order,
> but at most once, here is a non-regex approach (not terribly efficient
> if you need to do it millions of times, but it works):
> 
>  use strict;
> 
>  my $key = 'cCdeEfGhiI'; # legal chars
> 
>  check('cCdE');  # match
>  check('cCcE');  # no match
>  check('xccE');  # no match
> 
>  sub check
>  {
> my $val = shift;
> 
> print "Testing $val: ";
> my %h = map { ($_ => 0) } split //, $key;
> for (split //, $val)
> {
> print("No match\n"), return
> unless exists $h{$_} && !$h{$_}++;
> }
> print "Match\n";
>  }

Thanks for this Bob, but I forgot to say that :

- I need a regexp,
- I need a one-liner
- The order of characters doesn't matter

Sorry for this oversight, and thanks for your help.

Michel.
-- 
Michel Blanc
Centre Multimédia Erasme/Parc d'activités innovantes
69930 Saint Clément-les-places
Tel : +33-(0)4-74-70-68-40 / Fax : +33-(0)4-74-70-68-40

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: Regex help

2001-08-23 Thread Gibbs Tanton - tgibbs


A one liner would be
$is_bad = $string =~ /[^cCdeEfGhiIkKlLmMrRstTvVxX]/ || $string =~ /(.).*\1/;
naturally...if you need only one regex then you could say
$is_bad = $string =~ /[^cCdeEfGhiIkKlLmMrRstTvVxX]|((.).*\2)/;

I'm not sure if the character class is right for your app (I don't know the
exact letters; I just copied them from below), but you can change that.

Good Luck!
Tanton

-Original Message-
From: Michel Blanc
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: 8/23/2001 9:11 AM
Subject: Re: Regex help

Jeff 'japhy/Marillion' Pinyan a écrit :

> Someone else has already shown the general approach for ensuring a
> unique-character string:
> 
>   sub unique_characters {
> $_[0] =~ /(.).*?\1/s ? 0 : 1
>   }
> 
> We can extrapolate upon this idea and test afterward that the string
> doesn't contain any unwanted characters:
> 
>   sub unique_char_set {
> my ($str, $chars) = @_;
> return 0 if $str =~ /[^\Q$chars\E]/;
> return !($str =~ /(.).*?\1/s);
>   }

Thanks for your response guys. This is very useful.

In fact, since I needed a one liner (I forgot to say that) for a
regex-based dispatch table, I tried to convert that to :


($str !~ /(c|C|d|e|E|f|G|h|i|I|k|K|l|L|m|M|r|R|s|t|T|v|V|x|X).*?\1/s);

But this doesn't fail if unwanted characters are in.

So I am afraid that we'll arrive at (??{ ... }) 
What I am trying to do is some kind of shell. Command dispatching is
done via a hash containing "regex" => coderef style data. That's why I
am looking for a one--liner.

Thanks ! 
Michel.
-- 
Michel Blanc
Centre Multimédia Erasme/Parc d'activités innovantes
69930 Saint Clément-les-places
Tel : +33-(0)4-74-70-68-40 / Fax : +33-(0)4-74-70-68-40

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Regex help

2001-08-23 Thread Jeff 'japhy/Marillion' Pinyan


On Aug 23, Michel Blanc said:

>>   sub unique_char_set {
>> my ($str, $chars) = @_;
>> return 0 if $str =~ /[^\Q$chars\E]/;
>> return !($str =~ /(.).*?\1/s);
>>   }
>
>Thanks for your response guys. This is very useful.
>
>In fact, since I needed a one liner (I forgot to say that) for a
>regex-based dispatch table, I tried to convert that to :

Ok.  Then use this:

  if ($str =~ /[^cCdeEfGhiIkKlLmMrRstTvVxX]|(.).*?\1/) {
    # it was a bad string
  }
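
The pattern above matches the *bad* strings. If the dispatch table needs one
pattern that matches the *good* ones, a negative lookahead can express the
same test. This is a different technique from the one in this message, and
only a sketch: the shorter letter set from the original question is assumed,
and the table layout (an ordered list of pattern/handler pairs rather than a
hash) and the handler bodies are invented for illustration.

```perl
use strict;
use warnings;

# Valid input: only letters from the set, and no character occurs twice.
# (?!.*(.).*\1) fails as soon as any character repeats later in the string.
my $flags_re = qr/\A(?!.*(.).*\1)[cCdeEfGhiI]+\z/s;

# Hypothetical dispatch table; an array keeps the patterns in a fixed order.
my @dispatch = (
    [ $flags_re,   sub { "flags: $_[0]" } ],
    [ qr/\A.*\z/s, sub { "unrecognised: $_[0]" } ],
);

sub dispatch {
    my ($input) = @_;
    for my $entry (@dispatch) {
        my ($re, $handler) = @$entry;
        return $handler->($input) if $input =~ $re;
    }
    return;
}

print dispatch($_), "\n" for qw(cCdE cCcE xcE);
# prints:
#   flags: cCdE
#   unrecognised: cCcE
#   unrecognised: xcE
```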

-- 
Jeff "japhy" Pinyan  [EMAIL PROTECTED]  http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: regex help

2001-11-13 Thread Bob Showalter


> -Original Message-
> From: A. Rivera [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, November 13, 2001 1:39 PM
> To: [EMAIL PROTECTED]
> Subject: regex help
> 
> 
> Ok,
> 
> I need help find the most effecient way to do this..
> 
> I have a variable...
> $data="this is a test";
> 
> What is the quickest way to get $data to equal just the first 
> two words of
> the original variable

It depends on how you define a "word". If you mean any sequence
of non-whitespace, something like this should work:

  ($data) = $data =~ /(\S+\s+\S+)/;

\S+ matches first "word"
\s+ matches whitespace between words
\S+ matches second "word"
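
A short sketch of that assignment in action, including what happens when the
string has fewer than two words (the variable $short and its value are made up
for illustration):

```perl
use strict;
use warnings;

my $data = "this is a test";
($data) = $data =~ /(\S+\s+\S+)/;
print "$data\n";    # this is

# With only one word there is no match, so the list assignment
# leaves the variable undefined.
my $short = "single";
($short) = $short =~ /(\S+\s+\S+)/;
print defined($short) ? $short : '(no match)', "\n";
```

If an undefined result is a problem, test the match first and only assign
when it succeeds.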

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: regex help

2001-11-13 Thread Brett W. McCoy


On Tue, 13 Nov 2001, A. Rivera wrote:


> I need help find the most effecient way to do this..
>
> I have a variable...
> $data="this is a test";
>
> What is the quickest way to get $data to equal just the first two words of
> the original variable

This splits the string up, extracts the first two words, and joins them
again and re-assigns to $data:

$data = join(" ", (split(/\s/, $data))[0..1]);

-- Brett
  http://www.chapelperilous.net/

Everything should be built top-down, except this time.


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: regex help

2001-11-15 Thread Piers Cawley

"Brett W. McCoy" <[EMAIL PROTECTED]> writes:

> On Tue, 13 Nov 2001, A. Rivera wrote:
> 
> 
>> I need help find the most effecient way to do this..
>>
>> I have a variable...
>> $data="this is a test";
>>
>> What is the quickest way to get $data to equal just the first two words of
>> the original variable
> 
> This splits the string up, extracts the first two words, and joins them
> again and re-assigns to $data:
> 
> $data = join(" ", (split(/\s/, $data))[0..1]);

You may find that
  $data = join('', (split /(\s+)/, $data)[0..2]);

is a little more tolerant of multiple whitespace than Brett's answer.
Note the trick we use with split to capture the 'actual' whitespace
used to separate the words by putting the split pattern in brackets.

You could also make the change by doing:

  $data =~ s/((?:(?:^|\s+)\S+){2}).*/$1/;

Which has the advantage that you can easily change the number of words
you match simply by changing the value in the braces. And if you want
to catch at most 2 words, you'd have {0,2} in there...

I'm not sure which is the fastest; I've not benchmarked it, but it's
generally more important to worry about which is the *clearest*.
Programmer time is far more valuable than processor time.

So if you are sure that your data will never contain more than one
space between words, go with Brett's solution. If it might have more
than one space between words and you don't mind replacing them with a
single space, go with Brett's solution but replace \s with \s+ in the
split pattern.
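
To make the trade-off concrete, here is a small sketch running the three split
variants on a string with two spaces between the first two words. The sample
string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $data = "this  is a test";    # two spaces after "this"

my $plain   = join(" ", (split(/\s/,    $data))[0..1]);  # second field is empty
my $plus    = join(" ", (split(/\s+/,   $data))[0..1]);  # spacing normalised
my $capture = join('',  (split(/(\s+)/, $data))[0..2]);  # spacing preserved

printf "[%s]\n", $_ for $plain, $plus, $capture;
# prints:
#   [this ]
#   [this is]
#   [this  is]
```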

If you want to be flexible about data, then go with my solution, but
wrap it in a function like so:

use Carp;   # provides croak

sub truncate_to_n_words {
    my($string, @bounds) = @_;
    croak "Too many bounds" unless scalar @bounds <= 2;
    croak "Not enough bounds" unless scalar @bounds;
    local $" = ','; # makes "@bounds" separate terms with a comma
    $string =~ s{((?:             # replace
                   (?:^|\s+)      # line start or any number of spaces
                   \S+            # followed by some non-white chars
                  )               # Match this group
                  {@bounds}       # between $bounds[0] and $bounds[1] times
                 )                # And remember it.
                 .*               # Catch the remaining chars.
                }{$1}x;           # and throw them away.

    $_[0] = $string;  # Modify the original string in place.
}

sub truncate_to_2_words {
truncate_to_n_words($_[0], 2);
}

The idea being that, yes, the regular expression is ugly, but that
ugliness is hidden away behind a well named function. The code where
you need the behaviour will then look like:

truncate_to_2_words($string);

Which is substantially clearer than any of the one line solutions.

Of course, it's slower to run and took longer to write, but every time
you revisit code that makes use of it you'll not have to work out
what's going on.

-- 
Piers

   "It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
 -- Jane Austen?

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Regex help

2001-12-07 Thread Frank

On Fri, Dec 07, 2001 at 02:09:25PM +0100, Jorge wrote:
> I have this line in a file :
> host clin09 {
> hardware ethernet 00:80:9F:2E:3F:5E

Is this all on one line or is it two, it looks like two here.

If it's two, either set $/ = undef; to slurp all the lines in the file
into one scalar variable ($_), or set a flag to say "I've got host, the next
line should have /hardware ethernet/ in it".

A regex you could use is:

/hardware ethernet / && $MAC_ADR= $';

this checks to match the contents of the line to "hardware ethernet "
and sets MAC_ADR to the remainder of the line after the match.

This has an overhead, since using $&, $` or $' anywhere in a program means
they're populated on ALL subsequent regexes. (Capture variables $1..$9 only
cost you in the regexes that actually capture.) An alternative is to use a
capture group, a lookbehind, or even just plain old:

$line =~s/hardware ethernet //;$MAC_ADR=$line;
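
A capture group avoids $' altogether. The following is only a sketch: the
sample line is taken from the question, and the character class used for the
address (hex digits and colons) is an assumption about what the file contains.

```perl
use strict;
use warnings;

my $line = "    hardware ethernet 00:80:9F:2E:3F:5E";

my $mac_adr;
if ($line =~ /hardware ethernet\s+([0-9A-Fa-f:]+)/) {
    $mac_adr = $1;    # only the captured address, nothing after it
}
print "$mac_adr\n";   # 00:80:9F:2E:3F:5E
```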
-- 
 Frank Booth - Consultant
Parasol Solutions Limited.
(www.parasolsolutions.com)

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Regex help

2008-06-20 Thread yitzle

On Fri, Jun 20, 2008 at 3:10 PM, Ravi Malghan <[EMAIL PROTECTED]> wrote:
> Hi: I am trying to extract some stuff from a string and not getting the 
> expected results. I have looked through 
> http://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to figure 
> this one out.
> I have a string which is a sequence of words and each item is comma seperated
> field1, lengthof value1, value1,field2, length of value2, 
> value2,field3,length of value3, value3 and so on
> After each field name I have the length of the value
> I want to split this string into an array using comma seperator, but the 
> problem is some values have one or more commas within them.
> so for example my string might look like this
> $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, 
> with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
> USA,ESCALATION-LVL,1,0"
> My current code goes character by character and constructs what I want. But 
> is very slow when this string gets large.
> TIA
> Ravi

My solution:

use strict;
use warnings;

my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a
test note, with some commas, and more commas,ADDR,3515421 Test Lane,
Rockville, MD, USA,ESCALATION-LVL,1,0";

my @arr = split (/,/, $origString);
# print join ("\n", @arr); exit;

while ( scalar @arr ) {

    my $field = shift @arr;
    last unless ( defined $field );

    my $vlength = shift @arr;
    last unless ( defined $vlength );
    unless ( $vlength =~ /^\d+$/ ) {
        die "Invalid length: [$vlength]\n";
    }

    my $value = "";
    while ( length ( $value ) < $vlength ) {
        my $bit = shift @arr;
        last unless ( defined $bit );

        $value .= "," if ( length $value );
        $value .= "$bit";
    }
    print "$field -> $value\n";
}

Time it?

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: Regex help

2008-06-20 Thread John W. Krahn


Ravi Malghan wrote:

Hi: I am trying to extract some stuff from a string and not getting the 
expected results. I have looked through 
http://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to figure 
this one out.
I have a string which is a sequence of words and each item is comma seperated
field1, lengthof value1, value1,field2, length of value2, value2,field3,length 
of value3, value3 and so on
After each field name I have the length of the value
I want to split this string into an array using comma seperator, but the 
problem is some values have one or more commas within them.
so for example my string might look like this
$origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, with some 
commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
USA,ESCALATION-LVL,1,0"
My current code goes character by character and constructs what I want. But is 
very slow when this string gets large.


$ perl -le'
my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test 
note, with some commas, and more commas,ADDR,3515421 Test Lane, 
Rockville, MD, USA,ESCALATION-LVL,1,0";


while ( $origString =~ /([^,]+),(\d+),/g ) {
print for $1, $2, substr $origString, pos( $origString ), $2;
}
'
EMPLID
4
9066
USERID
7
W3LWEB1
TEXT
54
This is a test note, with some commas, and more commas
ESCALATION-LVL
1
0



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: Regex help

2008-06-20 Thread Rob Dixon

Ravi Malghan wrote:
> 
> Hi: I am trying to extract some stuff from a string and not getting the 
> expected results. I have looked through 
> http://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to figure 
> this one out.
> 
> I have a string which is a sequence of words and each item is comma seperated
> field1, lengthof value1, value1,field2, length of value2, 
> value2,field3,length of value3, value3 and so on
> 
> After each field name I have the length of the value
> 
> I want to split this string into an array using comma seperator, but the 
> problem is some values have one or more commas within them.
> 
> so for example my string might look like this
> 
> $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, 
> with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
> USA,ESCALATION-LVL,1,0"
> 
> My current code goes character by character and constructs what I want. But
> is very slow when this string gets large.

The program below will do what you describe.

HTH,

Rob


use strict;
use warnings;

my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,"
  . "This is a test note, with some commas, and more commas,"
  . "ADDR,3515421 Test Lane, Rockville, MD, USA,ESCALATION-LVL,1,0";

while() {

  $origString =~ /\G([^,]+),/g or last;
  my $field = $1;

  $origString =~ /\G(\d+),/g or last;
  my $size = $1;

  $origString =~ /\G(.{$size}),?/g or last;
  my $value = $1;

  printf "%s(%d) - %s\n", $field, $size, $value;
}

**OUTPUT**

EMPLID(4) - 9066
USERID(7) - W3LWEB1
TEXT(54) - This is a test note, with some commas, and more commas



-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: Regex help

2008-06-20 Thread Gunnar Hjalmarsson


Ravi Malghan wrote:
I have a string which is a sequence of words and each item is comma 
seperated
field1, lengthof value1, value1,field2, length of value2, 
value2,field3,length of value3, value3 and so on

After each field name I have the length of the value
I want to split this string into an array using comma seperator, but 
the problem is some values have one or more commas within them.


Okay. There is a missing comma between "ADDR,35" and "15421", right? 
Under that assumption, I believe this code gets what you want:


C:\home>type test.pl
my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,"
. "This is a test note, with some commas, and more commas,"
. "ADDR,35,15421 Test Lane, Rockville, MD, USA,ESCALATION-LVL,1,0";

my @parts = split /([A-Z-]+),(\d+)/, $origString;
shift @parts;
while ( my $k = shift @parts ) {
    my $length = shift @parts;
    print "$k => ", substr( shift @parts, 1, $length ), "\n";
}

C:\home>test.pl
EMPLID => 9066
USERID => W3LWEB1
TEXT => This is a test note, with some commas, and more commas
ADDR => 15421 Test Lane, Rockville, MD, USA
ESCALATION-LVL => 0

C:\home>

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: Regex help

2008-06-20 Thread icarus

On Jun 20, 9:10 am, [EMAIL PROTECTED] (Ravi Malghan) wrote:
> Hi: I am trying to extract some stuff from a string and not getting the 
> expected results. I have looked 
> throughhttp://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to 
> figure this one out.
> I have a string which is a sequence of words and each item is comma seperated
> field1, lengthof value1, value1,field2, length of value2, 
> value2,field3,length of value3, value3 and so on
> After each field name I have the length of the value
> I want to split this string into an array using comma seperator, but the 
> problem is some values have one or more commas within them.
> so for example my string might look like this
> $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, 
> with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
> USA,ESCALATION-LVL,1,0"
> My current code goes character by character and constructs what I want. But 
> is very slow when this string gets large.
> TIA
> Ravi



> I want to split this string into an array using comma seperator, but the 
> problem is some values have one or more commas within them [..]

> My current code goes character by character and constructs what I want. But 
> is very slow when this string gets large.
Post your code, or the relevant portion of it.  Otherwise we might
repeat here what you've already done or tried.

Is there any way you can use another delimiter, such as tildes (~)?
If you can tweak the format to accept other delimiters, that would be
easier to handle.  If you cannot, you could use a regex to find the next
alphanumeric character of the string and put those into an array:

  \w   Match a "word" character (alphanumeric plus "_")
  \W   Match a non-word character
  \b   Match a word boundary
  \B   Match a non-(word boundary)

or find out exactly how many commas it may have and weed them out:

  *      Match 0 or more times
  +      Match 1 or more times
  ?      Match 1 or 0 times
  {n}    Match exactly n times
  {n,}   Match at least n times
  {n,m}  Match at least n but not more than m times
  etc.

But again, post your code so we don't overlap...


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: Regex help

2008-06-21 Thread Dr.Ruud

Ravi Malghan schreef:

> I want to split this string into an array using comma seperator, but
> the problem is some values have one or more commas within them.

That is a common problem. First split on comma, then recombine elements
by using out-of-band knowledge.
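
Here the out-of-band knowledge is the length that precedes each value. A
minimal sketch of split-then-recombine follows; the function name
parse_fields and the short sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [field, value] pairs.
sub parse_fields {
    my ($string) = @_;
    my @parts = split /,/, $string, -1;   # -1 keeps trailing empty pieces
    my @pairs;
    while (@parts) {
        my $field  = shift @parts;
        my $length = shift @parts;
        die "Bad length for $field\n"
            unless defined $length && $length =~ /^\d+$/;
        my $value = shift @parts;
        $value = '' unless defined $value;
        # Glue pieces back together until the declared length is reached.
        while (length($value) < $length && @parts) {
            $value .= ',' . shift @parts;
        }
        push @pairs, [ $field, $value ];
    }
    return @pairs;
}

my $string = "ID,4,9066,TEXT,8,one, two,LVL,1,0";
print "$_->[0] => $_->[1]\n" for parse_fields($string);
# prints:
#   ID => 9066
#   TEXT => one, two
#   LVL => 0
```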

-- 
Affijn, Ruud

"Gewoon is een tijger."


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: Regex help

2008-06-21 Thread Rob Dixon

Rob Dixon wrote:
> Ravi Malghan wrote:
>> Hi: I am trying to extract some stuff from a string and not getting the 
>> expected results. I have looked through 
>> http://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to figure 
>> this one out.
>>
>> I have a string which is a sequence of words and each item is comma seperated
>> field1, lengthof value1, value1,field2, length of value2, 
>> value2,field3,length of value3, value3 and so on
>>
>> After each field name I have the length of the value
>>
>> I want to split this string into an array using comma seperator, but the 
>> problem is some values have one or more commas within them.
>>
>> so for example my string might look like this
>>
>> $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, 
>> with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
>> USA,ESCALATION-LVL,1,0"
>>
>> My current code goes character by character and constructs what I want. But
>> is very slow when this string gets large.
> 
> The program below will do what you describe.

Here's an improvement that explains when it doesn't find values that it expects.

Rob


use strict;
use warnings;

my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,"
  . "This is a test note, with some commas, and more commas,"
  . "ADDR,35,15421 Test Lane, Rockville, MD, USA,ESCALATION-LVL,1,0";

while() {

  $origString =~ /\G([^,]+),(\d+),/g or die "No field name / size found";
  my ($field, $size) = ($1, $2);

  $origString =~ /\G(.{$size})/g or die "Insufficient characters for field size";
  my $value = $1;

  printf "%s (%d) - %s\n", $field, $size, $value;

  $origString =~ /\G,/g or last;
}

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: regex help

2008-07-24 Thread Mr. Shawn H. Corey

On Thu, 2008-07-24 at 09:44 -0400, Tony Heal wrote:
> I have a text dump of a postgresql database and I want to find out if there
> are any characters that are not standard keyboard characters. Is there a way
> to use regex to do this without doing a character by character scan of a 5GB
> file. 

No.

> I want to know where any character that is not one of these is in the
> file: a-z A-Z 0-9 [EMAIL PROTECTED]  PROTECTED]&*()[]{};:'",.<>/?|\>
> &*()[]{};:'",.<>/?|\


See `perldoc POSIX` and search for 'isalpha'
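
Every character still has to be examined one way or another, but a negated
character class lets the regex engine do that scan a line at a time. The
sketch below is not from this thread. It assumes "standard keyboard
characters" means printable ASCII (space through tilde) plus tab, and it uses
a small in-memory sample in place of the real dump file; the function name
find_odd_chars is made up.

```perl
use strict;
use warnings;

# Assumption: anything outside printable ASCII and tab counts as "odd".
sub find_odd_chars {
    my ($fh) = @_;
    my @hits;
    while (my $line = <$fh>) {
        chomp $line;
        while ($line =~ /([^\x20-\x7E\t])/g) {
            # pos() is the offset just past the match, i.e. the 1-based column.
            push @hits, sprintf "line %d, column %d: 0x%02X",
                $., pos($line), ord $1;
        }
    }
    return @hits;
}

# Demonstration on an in-memory file; a real run would open the dump instead.
my $sample = "plain text\nbad\x01byte\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n" for find_odd_chars($fh);    # line 2, column 4: 0x01
close $fh;
```

Reading line by line keeps memory use flat regardless of file size.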


-- 
Just my 0.0002 million dollars worth,
  Shawn

"Where there's duct tape, there's hope."

"Perl is the duct tape of the Internet."
Hassan Schroeder, Sun's first webmaster


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: Regex help

2011-05-16 Thread Jim Gibson

On 5/16/11 Mon  May 16, 2011  3:44 PM, "Owen"  scribbled:

> I am trying to get all the 6 letter names in the second field in DATA
> below, eg
> 
> BARTON
> DARWIN
> DARWIN
> 
> But the script below gives me all 6 letter and more entries.
> 
> What I read says {6} means exactly 6.

\S{6} will match any string containing 6 consecutive non-whitespace
characters. It will also match any string containing more than 6 such
characters, because any such string contains within it a substring of
exactly six characters. Perl matches do not have to match the entire string.

> 
> What is the correct RE?

If you want exactly six characters, then you need to specify that any
characters before or after the wanted six are not also members of the
desired class. In your case, the easiest way is to anchor the match at the
beginning and the end:

$line[1] =~ /^\S{6}$/

If you were looking for word characters, e.g. \w, you could use the word
boundary assertion metasymbol \b:

$line[1] =~ /\b\w{6}\b/

That will not work if your names contain punctuation characters, e.g
O'Reilly. More complex matches can use the negative lookahead and lookbehind
constructs.
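
The difference between the unanchored, anchored and lookaround forms can be
seen side by side in this sketch. The names come from the data in the
question, except 'PORT DARWIN', which is made up to show a six-character run
inside a longer field.

```perl
use strict;
use warnings;

my @names = ('BARTON', 'DARWIN', 'PARAP', 'CASUARINA',
             'COCONUT GROVE', 'PORT DARWIN');

my @unanchored = grep { /\S{6}/ } @names;               # six or more in a row
my @anchored   = grep { /^\S{6}$/ } @names;             # whole string is six
my @lookaround = grep { /(?<!\S)\S{6}(?!\S)/ } @names;  # a run of exactly six

print "unanchored: ", join(', ', @unanchored), "\n";
print "anchored:   ", join(', ', @anchored),   "\n";
print "lookaround: ", join(', ', @lookaround), "\n";
# prints:
#   unanchored: BARTON, DARWIN, CASUARINA, COCONUT GROVE, PORT DARWIN
#   anchored:   BARTON, DARWIN
#   lookaround: BARTON, DARWIN, PORT DARWIN
```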

> 
> I have solved the problem my using if (length($data[1]) == 6 ) but
> would love to know the correct syntax for the RE
> 
> 
> TIA
> 
> 
> Owen
> 
> 
> =
> 
> #!/usr/bin/perl
> 
> use strict;
> use warnings;
> 
> while (<DATA>) {
> my $line = $_;
> 
> my @line = split /,/;
> $line[1] =~ s /\"//g;
> 
> print "$line[1]\n" if $line[1] =~ /\S{6}/;
> }
> 
> __DATA__
> "0200","AUSTRALIAN NATIONAL UNIVERSITY","ACT","PO Boxes"
> "0221","BARTON","ACT","LVR Special Mailing"
> "0800","DARWIN","NT",,"DARWIN DELIVERY CENTRE"
> "0801","DARWIN","NT","GPO Boxes","DARWIN GPO DELIVERY ANNEXE"
> "0804","PARAP","NT","PO Boxes","PARAP LPO"
> "0810","ALAWA","NT",,"DARWIN DELIVERY CENTRE"
> "0810","BRINKIN","NT",,"DARWIN DELIVERY CENTRE"
> "0810","CASUARINA","NT",,"DARWIN DELIVERY CENTRE"
> "0810","COCONUT GROVE","NT",,"DARWIN DELIVERY CENTRE"
> 
> ===



-- 
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/

Re: Regex help

2011-05-17 Thread Rob Dixon


On 16/05/2011 23:44, Owen wrote:


I am trying to get all the 6 letter names in the second field in DATA
below, eg

BARTON
DARWIN
DARWIN

But the script below gives me all 6 letter and more entries.

What I read says {6} means exactly 6.

What is the correct RE?

I have solved the problem by using if (length($data[1]) == 6 ) but
would love to know the correct syntax for the RE


=

#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
 my $line = $_;

 my @line = split /,/;
 $line[1] =~ s /\"//g;

 print "$line[1]\n" if $line[1] =~ /\S{6}/;
}

__DATA__
"0200","AUSTRALIAN NATIONAL UNIVERSITY","ACT","PO Boxes"
"0221","BARTON","ACT","LVR Special Mailing"
"0800","DARWIN","NT",,"DARWIN DELIVERY CENTRE"
"0801","DARWIN","NT","GPO Boxes","DARWIN GPO DELIVERY ANNEXE"
"0804","PARAP","NT","PO Boxes","PARAP LPO"
"0810","ALAWA","NT",,"DARWIN DELIVERY CENTRE"
"0810","BRINKIN","NT",,"DARWIN DELIVERY CENTRE"
"0810","CASUARINA","NT",,"DARWIN DELIVERY CENTRE"
"0810","COCONUT GROVE","NT",,"DARWIN DELIVERY CENTRE"

===


Hi Owen.

Your test establishes only whether the pattern can be found within the
object string. A test like

"CASUARINA" =~ /\S{6}/;

finds the six non-space characters "CASUAR" and then returns success as
the criterion has been satisfied.

To get it to match /only/ six-character non-space strings you can add
anchors at the beginning and end of the regex:

"CASUARINA" =~ /^\S{6}$/;

will fail because the sequence "beginning of line, six non-space
characters, end of line" doesn't appear in "CASUARINA".

But the proper way to do this is to forget about regular expressions and
treat the data as comma-separated fields. The module Text::CSV will do
this for you, as per the program below.

HTH,

Rob


use strict;
use warnings;

use Text::CSV;

my $csv = Text::CSV->new;

while (my $fields = $csv->getline(*DATA)) {
  my $suburb = $fields->[1];
  next unless $suburb and length $suburb == 6;
  print $suburb, "\n";
}

__DATA__
"0200","AUSTRALIAN NATIONAL UNIVERSITY","ACT","PO Boxes"
"0221","BARTON","ACT","LVR Special Mailing"
"0800","DARWIN","NT",,"DARWIN DELIVERY CENTRE"
"0801","DARWIN","NT","GPO Boxes","DARWIN GPO DELIVERY ANNEXE"
"0804","PARAP","NT","PO Boxes","PARAP LPO"
"0810","ALAWA","NT",,"DARWIN DELIVERY CENTRE"
"0810","BRINKIN","NT",,"DARWIN DELIVERY CENTRE"
"0810","CASUARINA","NT",,"DARWIN DELIVERY CENTRE"
"0810","COCONUT GROVE","NT",,"DARWIN DELIVERY CENTRE"

**OUTPUT**

BARTON
DARWIN
DARWIN



Re: regex help

2007-08-02 Thread Chas Owens

On 8/2/07, Tony Heal <[EMAIL PROTECTED]> wrote:
snip
> Why doesn't this work? I want to take any leading or trailing white spaces 
> out.
> If I remove the remark it works, but I
> do not understand why it requires the second line
> $string =~ s/^(\s+)(.*)(\s+)$/$2/;
snip

Because (.*) matches all but the one space needed by the second (\s+).
 The . matches everything including the spaces.  You can fix this by
saying

$string =~ s/^(\s+)(.*?)(\s+)$/$2/;

to make (.*) match the smallest pattern (non-greedy) instead of the
largest (greedy).
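
A small sketch of the two substitutions side by side may help. The helper
names trim_greedy and trim_lazy and the sample string are assumptions made
for illustration. Note that both patterns only match when the string has
whitespace at both ends.

```perl
use strict;
use warnings;

# Hypothetical helper: the original substitution with a greedy (.*).
sub trim_greedy {
    my ($s) = @_;
    $s =~ s/^(\s+)(.*)(\s+)$/$2/;
    return $s;
}

# Hypothetical helper: the same substitution with a non-greedy (.*?).
sub trim_lazy {
    my ($s) = @_;
    $s =~ s/^(\s+)(.*?)(\s+)$/$2/;
    return $s;
}

my $string = "  hello world   ";

# The brackets make any leftover trailing spaces visible.
print "greedy: [", trim_greedy($string), "]\n";
print "lazy:   [", trim_lazy($string),   "]\n";
```

With the greedy version, the final (\s+) is left with a single space, so the
other trailing spaces stay inside $2.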


Re: regex help

2007-08-02 Thread Ricky Zhou

Tony Heal wrote:
> Why doesn't this work? I want to take any leading or trailing white spaces 
> out. If I remove the remark it works, but I
> do not understand why it requires the second line
For reference, see perldoc perlre and search for "greedy".

Basically, the .* matches as much as possible, so it gets the spaces as
well.  To make it not greedy, you add a ?, so
$string =~ s/^\s+(.*?)\s+$/$1/;
would work.

Hope this helps,
Ricky

RE: regex help

2007-08-02 Thread Tony Heal

So since '?' will match the last character, group, or class 0 or 1 times, it
matches the group of whatever happens to be in '.*' up to any spaces that are
attached to the '$'.

Is that correct?

Tony Heal


> -Original Message-
> From: Chas Owens [mailto:[EMAIL PROTECTED]
> Sent: Thursday, August 02, 2007 4:55 PM
> To: [EMAIL PROTECTED]
> Cc: beginners@perl.org
> Subject: Re: regex help
> 
> On 8/2/07, Tony Heal <[EMAIL PROTECTED]> wrote:
> snip
> > Why doesn't this work? I want to take any leading or trailing white spaces 
> > out.
> > If I remove the remark it works, but I
> > do not understand why it requires the second line
> > $string =~ s/^(\s+)(.*)(\s+)$/$2/;
> snip
> 
> Because (.*) matches all but the one space needed by the second (\s+).
>  The . matches everything including the spaces.  You can fix this by
> saying
> 
> $string =~ s/^(\s+)(.*?)(\s+)$/$2/;
> 
> to make (.*) match the smallest pattern (non-greedy) instead of the
> largest (greedy).



Re: regex help

2007-08-02 Thread Chas Owens

On 8/2/07, Tony Heal <[EMAIL PROTECTED]> wrote:
> So since '?' will match the last character, group, or class 0 or 1 times, it
> matches the group of whatever happens to be in '.*' up to any spaces that
> are attached to the '$'.
>
> Is that correct?
snip

No, the ? in .*? is not the same as the ? in [abc]?, just as neither
of them is the same as the ? in (?:foo).  The character is being
reused, but the meanings are completely separate.  The ? character
when used after a quantifier (i.e. *, +, ?, {n}, {n,}, or {n,m}) means
"match the smallest possible string" (non-greedy).  The default for those
quantifiers is to match the largest string possible (greedy).

from perldoc perlre:
   The following standard quantifiers are recognized:

   *      Match 0 or more times
   +      Match 1 or more times
   ?      Match 1 or 0 times
   {n}    Match exactly n times
   {n,}   Match at least n times
   {n,m}  Match at least n but not more than m times
snip
   By default, a quantified subpattern is "greedy", that is, it will match
   as many times as possible (given a particular starting location) while
   still allowing the rest of the pattern to match.  If you want it to
   match the minimum number of times possible, follow the quantifier with
   a "?".  Note that the meanings don't change, just the "greediness":

   *?     Match 0 or more times
   +?     Match 1 or more times
   ??     Match 0 or 1 time
   {n}?   Match exactly n times
   {n,}?  Match at least n times
   {n,m}? Match at least n but not more than m times
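
As a rough illustration of the tables above, the sketch below captures with a
greedy quantifier and with its non-greedy counterpart on the same input. The
helper first_capture and the sample strings are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns what the first capture group matched, or undef.
sub first_capture {
    my ($text, $re) = @_;
    return $text =~ $re ? $1 : undef;
}

my $text = '<b>bold</b> and <i>italic</i>';

# .+ grabs as much as it can and backs off only as far as the last '>'.
print "greedy .+     : ", first_capture($text, qr/<(.+)>/),  "\n";
# .+? grabs as little as it can and stops at the first '>'.
print "lazy   .+?    : ", first_capture($text, qr/<(.+?)>/), "\n";

# The same idea with a counted quantifier.
print "greedy {2,4}  : ", first_capture('aaaaaa', qr/(a{2,4})/),  "\n";
print "lazy   {2,4}? : ", first_capture('aaaaaa', qr/(a{2,4}?)/), "\n";
```

In both pairs the set of allowed repeat counts is the same; only the
preference between them changes.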


Re: regex help

2007-08-02 Thread Jeff Pang



-Original Message-
>From: "John W. Krahn" <[EMAIL PROTECTED]>
>Sent: Aug 3, 2007 10:37 AM
>To: Perl beginners 
>Subject: Re: regex help
>
>Tony Heal wrote:
>> Why doesn't this work? I want to take any leading or trailing white spaces 
>> out.
>
>perldoc -q "How do I strip blank space"
>
>

Or generally it could be done by,
$string =~ s/^\s+|\s+$//g;

--
Jeff Pang <[EMAIL PROTECTED]>
http://home.arcor.de/jeffpang/


Re: regex help

2007-08-02 Thread John W. Krahn


Tony Heal wrote:

Why doesn't this work? I want to take any leading or trailing white spaces out.


perldoc -q "How do I strip blank space"


John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


Re: regex help

2007-08-08 Thread Dr.Ruud

Jeff Pang wrote:
> John W. Krahn:
>> Tony Heal:

>>> Why doesn't this work? I want to take any leading 
>>> or trailing white spaces out. 
>> 
>> perldoc -q "How do I strip blank space"
> 
> Or generally it could be done by,
> $string =~ s/^\s+|\s+$//g;

The g-modifier doesn't mean "generally" nor "good". ;-) 
Please see the suggested perldoc text for the proper ways. 

I like to use:

  s/^\s+//, s/\s+$// for $string;

but

  $string =~ s/^\s+//;
  $string =~ s/\s+$//;

may be slightly faster.
(presumably because there is no localization of $_)

-- 
Affijn, Ruud

"Gewoon is een tijger."


RE: regex help

2007-08-08 Thread Dan Sopher

This works in a one-liner:

$string =~ s/^\s*(.*\S)\s*$/$1/;

Cheers!

-Dan




Re: regex help

2007-08-08 Thread John W. Krahn


Dan Sopher wrote:

This works in a one-liner:

$string =~ s/^\s*(.*\S)\s*$/$1/;

Cheers!


Let's compare Dan's one-liner to the solutions in the FAQ (perlfaq4):

$ perl -le'
for ( "\nX\n", "\nX", "X\n", "X", "\n\n\n", "\n", "" ) {
$a = $b = $c = $_;

$d  = $a =~ s/^\s*(.*\S)\s*$/$1/;

$e  = $b =~ s/^\s+//;
$e += $b =~ s/\s+$//;

$f  = $c =~ s/^\s+|\s+$//g;

print "Test: ", ++$g," Length of original: ", length( $_ ), "\n",
  "Dan\047s length: ", length( $a ), " on a string that was", $d ? "" 
: " NOT", " modified.\n",
  "FAQ 1 length: ",length( $b ), " on a string that was", $e ? "" 
: " NOT", " modified.\n",
  "FAQ 2 length: ",length( $c ), " on a string that was", $f ? "" 
: " NOT", " modified.\n";

}
'
Test: 1 Length of original: 3
Dan's length: 1 on a string that was modified.
FAQ 1 length: 1 on a string that was modified.
FAQ 2 length: 1 on a string that was modified.

Test: 2 Length of original: 2
Dan's length: 1 on a string that was modified.
FAQ 1 length: 1 on a string that was modified.
FAQ 2 length: 1 on a string that was modified.

Test: 3 Length of original: 2
Dan's length: 1 on a string that was modified.
FAQ 1 length: 1 on a string that was modified.
FAQ 2 length: 1 on a string that was modified.

Test: 4 Length of original: 1
Dan's length: 1 on a string that was modified.
FAQ 1 length: 1 on a string that was NOT modified.
FAQ 2 length: 1 on a string that was NOT modified.

Test: 5 Length of original: 3
Dan's length: 3 on a string that was NOT modified.
FAQ 1 length: 0 on a string that was modified.
FAQ 2 length: 0 on a string that was modified.

Test: 6 Length of original: 1
Dan's length: 1 on a string that was NOT modified.
FAQ 1 length: 0 on a string that was modified.
FAQ 2 length: 0 on a string that was modified.

Test: 7 Length of original: 0
Dan's length: 0 on a string that was NOT modified.
FAQ 1 length: 0 on a string that was NOT modified.
FAQ 2 length: 0 on a string that was NOT modified.
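
For reuse, the two-substitution approach can be wrapped in a small function.
This is only a sketch: the names trim and trim_one_liner are made up here,
and the inputs are a subset of the test strings above.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the two-substitution approach from the FAQ.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

# Dan's one-liner, wrapped the same way for comparison.
sub trim_one_liner {
    my ($s) = @_;
    $s =~ s/^\s*(.*\S)\s*$/$1/;
    return $s;
}

for my $input ("\nX\n", "X", "\n\n\n", "") {
    printf "input length %d: FAQ -> %d, one-liner -> %d\n",
        length($input), length(trim($input)), length(trim_one_liner($input));
}
```

The one-liner requires at least one non-space character (\S), which is why it
leaves whitespace-only strings untouched.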



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


Re: regex help

2007-08-20 Thread Mr. Shawn H. Corey


Tony Heal wrote:

I have an array that will have these values. Each value is part of a file name.
I need to keep the highest (numerically) 5 files and delete the rest.  What is
the easiest way to sort the array?


Break each file name into fields and sort from the most significant field to the
least.  Use the Schwartzian Transform to sort.
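
A minimal sketch of the idea, under an assumed naming scheme: the names here
look like "pkg-MAJOR.MINOR", which is invented for illustration and is not
the poster's real format. The helper name sort_by_fields is likewise made up.

```perl
use strict;
use warnings;

# Hypothetical helper: sorts names of the assumed form "pkg-MAJOR.MINOR"
# numerically by MAJOR, then by MINOR.
sub sort_by_fields {
    return
        map  { $_->[0] }                     # 3. recover the original name
        sort { $a->[1] <=> $b->[1]           # 2. compare the most significant
                   or $a->[2] <=> $b->[2] }  #    field first, then the next
        map  {                               # 1. split each name into fields
            my ($major, $minor) = /-(\d+)\.(\d+)$/
                or die "unexpected name: '$_'\n";
            [ $_, $major, $minor ];
        }
        @_;
}

print join(' ', sort_by_fields(qw(pkg-2.10 pkg-10.1 pkg-2.9))), "\n";
```

Each name is parsed only once, and the sort compares plain numbers, so 2.9
sorts before 2.10 and both sort before 10.1.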

See:
perldoc perlretut
perldoc perlre


--
Just my 0.0002 million dollars worth,
 Shawn

"For the things we have to learn before we can do them, we learn by doing them."
 Aristotle


Re: regex help

2007-08-20 Thread Jeff Pang



-Original Message-
>From: Tony Heal <[EMAIL PROTECTED]>
>Sent: Aug 21, 2007 5:50 AM
>To: beginners@perl.org
>Subject: regex help
>
>I have an array that will have these values. Each value is part of a file 
>name. I need to keep the highest (numerically)
>5 files and delete the rest.  What is the easiest to sort the array.
>

Well, it can be sorted, but by which field in the filename? The last numerical
field?

Here is one way:

use strict;
use warnings;

my @arr = qw(14-special.4-32
14-special.4-32
14-special.4-33
14-special.4-33
15-special.1-51
15-special.1-51
15-special.1-52
15-special.1-52
15-special.1-52
15-special.1-53
15-special.1-53
15-special.1-53
15-special.1-54
15-special.1-54
15-special.3-44
15-special.3-44
15-special.3-45
15-special.3-45
15-special.4-4
15-special.4-4
15.2-100
15.2-100
15.2-104
15.2-104
15.2-124
15.2-124
15.2-65
15.2-65
15.2-66
15.2-66);

my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { 
[$_,(split/-/)[-1]] } @arr;
print "@new[0..4]";


--
Jeff Pang - [EMAIL PROTECTED]
http://home.arcor.de/jeffpang/


Re: regex help

2007-08-20 Thread Mr. Shawn H. Corey


Jeff Pang wrote:

use strict;
use warnings;

my @arr = qw(14-special.4-32
14-special.4-32
14-special.4-33
14-special.4-33
15-special.1-51
15-special.1-51
15-special.1-52
15-special.1-52
15-special.1-52
15-special.1-53
15-special.1-53
15-special.1-53
15-special.1-54
15-special.1-54
15-special.3-44
15-special.3-44
15-special.3-45
15-special.3-45
15-special.4-4
15-special.4-4
15.2-100
15.2-100
15.2-104
15.2-104
15.2-124
15.2-124
15.2-65
15.2-65
15.2-66
15.2-66);

my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { 
[$_,(split/-/)[-1]] } @arr;
print "@new[0..4]";


Fails; this would put '15-special.3-45' before '15-special.1-51'

As I said, separate the data into fields, based on your knowledge of how to do 
it.  (Nobody on this list knows how.)

Then you can sort.


--
Just my 0.0002 million dollars worth,
 Shawn

"For the things we have to learn before we can do them, we learn by doing them."
 Aristotle


Re: regex help

2007-08-20 Thread Jeff Pang



-Original Message-
>From: "Mr. Shawn H. Corey" <[EMAIL PROTECTED]>
>Sent: Aug 21, 2007 12:32 PM
>To: Jeff Pang <[EMAIL PROTECTED]>
>Cc: beginners@perl.org
>Subject: Re: regex help
>
>Jeff Pang wrote:
>> use strict;
>> use warnings;
>> 
>> my @arr = qw(14-special.4-32
>> 14-special.4-32
>> 14-special.4-33
>> 14-special.4-33
>> 15-special.1-51
>> 15-special.1-51
>> 15-special.1-52
>> 15-special.1-52
>> 15-special.1-52
>> 15-special.1-53
>> 15-special.1-53
>> 15-special.1-53
>> 15-special.1-54
>> 15-special.1-54
>> 15-special.3-44
>> 15-special.3-44
>> 15-special.3-45
>> 15-special.3-45
>> 15-special.4-4
>> 15-special.4-4
>> 15.2-100
>> 15.2-100
>> 15.2-104
>> 15.2-104
>> 15.2-124
>> 15.2-124
>> 15.2-65
>> 15.2-65
>> 15.2-66
>> 15.2-66);
>> 
>> my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { 
>> [$_,(split/-/)[-1]] } @arr;
>> print "@new[0..4]";
>
>Fails; this would put '15-special.3-45' before '15-special.1-51'
>

Well, did you test the code before saying this?
I sort on the last number field, split on '-'. It works fine for me.


--
Jeff Pang - [EMAIL PROTECTED]
http://home.arcor.de/jeffpang/


Re: regex help

2007-08-21 Thread Mr. Shawn H. Corey

Jeff Pang wrote:

-Original Message-

From: "Mr. Shawn H. Corey" <[EMAIL PROTECTED]>
Sent: Aug 21, 2007 12:32 PM
To: Jeff Pang <[EMAIL PROTECTED]>
Cc: beginners@perl.org
Subject: Re: regex help

Jeff Pang wrote:

my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { 
[$_,(split/-/)[-1]] } @arr;
print "@new[0..4]";

Fails; this would put '15-special.3-45' before '15-special.1-51'

Well,have you tested the codes then said this?
I sort it based on the last number field splited by '-'.It works fine for me.

The point is that only the OP can say what is significant.  And s/he hasn't.

--
Just my 0.0002 million dollars worth,
 Shawn

"For the things we have to learn before we can do them, we learn by doing them."
 Aristotle


RE: regex help

2007-08-21 Thread Tony Heal

the list is a list of files by version. I need to keep the last 5 versions.

Jeff's code works fine except I am getting some empty strings at the beginning
that I have not figured out.

Here is what I have so far. Lines 34 and 39 provide a printout for
troubleshooting. Once I get this fixed, all I need to do is shift the top five
from the list and unlink the rest.

#!/usr/bin/perl

use warnings;
use strict;

opendir (REPOSITORY, '/usr/local/repository/dists/');
my @repositories = readdir (REPOSITORY);
closedir (REPOSITORY);

my $packageRepo;
my @values;
my @newValues;
foreach (@repositories)
{
    $packageRepo = $_;
    chomp ($packageRepo);
    opendir (packageREPO, "/usr/local/repository/dists/$packageRepo/non-free/binary-i386");
    my @repoFiles = readdir (packageREPO);
    close (packageREPO);
    foreach (@repoFiles)
    {
        my $fileName = $_;
        chomp ($fileName);
        if ( /(.*)(([0-9][0-9])(-special)?\.([0-9])(-)([0-9]*))(.*)/)
        {
            push (@values, $2);
        }
    }
    my %h;
    foreach (@values)
    {
        push (@newValues, $_) unless $h{$_}++
    }
    foreach (@newValues){print "$_\n";}
    my @new = map { $_->[0] }
        sort { $b->[1] <=> $a->[1] }
        map { [$_,(split/-/)[-1]] }
        @newValues;
    print "@new[0..4]\n";
}


Or for a line numbered version
http://rafb.net/p/asqgJo27.html

Tony Heal




RE: regex help

2007-08-21 Thread Tony Heal

Here is a sample of the versions that I am using.
16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6

Tony Heal
Pace Systems Group, Inc.
800-624-5999
[EMAIL PROTECTED]
 


Re: regex help

2007-08-21 Thread Chas Owens

On 8/21/07, Tony Heal <[EMAIL PROTECTED]> wrote:
> Here is a sample of the versions that I am using.
snip

Just to clarify, you have a version string with the following format:

{major}{custom tag}.{minor}-{build}

and you want the list sorted by major, then minor, then build.

#!/usr/bin/perl

use strict;
use warnings;

my @versions;
while (<DATA>) {
    chomp;
    die "invalid format" unless
        my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
    push @versions, [ $major, $minor, $build , $_];
}

print "$_->[-1]\n" for sort {
    $a->[0] <=> $b->[0] or
    $a->[1] <=> $b->[1] or
    $a->[2] <=> $b->[2]
} @versions;

__DATA__
16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6


RE: regex help

2007-08-21 Thread Jeff Pang



-Original Message-
>From: Tony Heal <[EMAIL PROTECTED]>
>Sent: Aug 21, 2007 9:25 PM
>To: [EMAIL PROTECTED], beginners@perl.org
>Subject: RE: regex help
>
>Here is a sample of the versions that I am using.
>16.1-17
>16.1-22
>16.1-23
>16.1-39
>16.3-1
>16.3-6
>16.3-7
>16.3-8
>16.3-15
>16.5-1
>16.5-2
>16.5-10
>16.5-13
>15.3-12
>15.2-108
>14-special.1-2
>14-special.1-8
>14-special.1-15
>14-special.2-40
>14-special.2-41
>14-special.3-4
>14-special.3-7
>14-special.3-12
>15.2-110
>15.2-111
>15-special.1-52
>15-special.1-53
>15-special.1-54
>16-special.4-9
>16-special.4-10
>16-special.5-1
>16-special.5-2
>16-special.6-6
>

OK, try it this way. It sorts the versions from high to low and outputs the first 5.

use strict;
use warnings;

my @arr = qw(16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6
);

my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] or $b->[2] <=> $a->[2] or 
$b->[3] <=> $a->[3] } map { [ $_, split/\D+/ ] } @arr;
print "@new[0..4]";

__END__

Good luck!

--
Jeff Pang - [EMAIL PROTECTED]
http://home.arcor.de/jeffpang/


Re: regex help

2007-08-21 Thread Chas Owens

On 8/21/07, Jeff Pang <[EMAIL PROTECTED]> wrote:
snip
> my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] or $b->[2] <=>
> $a->[2] or $b->[3] <=> $a->[3] } map { [ $_, split/\D+/ ] } @arr;
snip

While splitting on non-number is a nifty solution, it would break if
the custom tag can contain a number (16-custom2.2-14).  It is better
to nail down the version number scheme and write a regex that pulls
the required info from it and throws an error if a version does not
match the scheme.
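
The difference can be seen with a short sketch. The helper names
fields_by_split and fields_by_regex are made up for illustration, and the
pattern assumes the format {major}{optional -tag}.{minor}-{build}; the real
scheme may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: field extraction by splitting on runs of non-digits.
sub fields_by_split {
    my ($version) = @_;
    return split /\D+/, $version;
}

# Hypothetical helper: field extraction with an explicit pattern.
# Dies on anything that does not fit the assumed scheme.
sub fields_by_regex {
    my ($version) = @_;
    my ($major, $minor, $build) =
        $version =~ /^(\d+)(?:-[^.]+)?\.(\d+)-(\d+)$/
        or die "invalid version format: '$version'\n";
    return ($major, $minor, $build);
}

for my $version ('16-special.4-9', '16-custom2.2-14') {
    print "$version\n";
    print "  split: @{[ fields_by_split($version) ]}\n";
    print "  regex: @{[ fields_by_regex($version) ]}\n";
}
```

For the tag containing a digit, the split version returns four fields, so a
sort on fields 1 to 3 would compare the wrong numbers without any warning.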


Re: regex help

2007-08-21 Thread Chas Owens

On 8/21/07, Jeff Pang <[EMAIL PROTECTED]> wrote:
>
>
> -Original Message-
> >From: Chas Owens <[EMAIL PROTECTED]>
> >Sent: Aug 21, 2007 10:01 PM
> >To: Jeff Pang <[EMAIL PROTECTED]>
> >Cc: [EMAIL PROTECTED], beginners@perl.org
> >Subject: Re: regex help
> >
> >On 8/21/07, Jeff Pang <[EMAIL PROTECTED]> wrote:
> >snip
> >> my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] or $b->[2] <=>
> >> $a->[2] or $b->[3] <=> $a->[3] } map { [ $_, split/\D+/ ] } @arr;
> >snip
> >
> >While splitting on non-number is a nifty solution, it would break if
> >the custom tag can contain a number (16-custom2.2-14).  It is better
> >to nail down the version number scheme and write a regex that pulls
> >the required info from it that throws an error if a version does not
> >match the scheme.
>
> Have you seen this case in his data?

I have seen a sampling of his data; if that is all of the data he has
then he can sort it by hand and doesn't need Perl.  Experience has
taught me to expect the worst from data.  You need to be able to
detect (if not recover from) malformed data and your split /\D/ will
just silently do the wrong thing (well, there might be some undef
warnings if the version were "12.4").  GIGO* is fine for custom
crafted one liners, but production quality code should at least make
an attempt to notice if the data is bad and signal the user/admin.

* Garbage In/Garbage Out


Re: regex help

2007-08-21 Thread Jeff Pang



-Original Message-
>From: Chas Owens <[EMAIL PROTECTED]>
>Sent: Aug 21, 2007 10:01 PM
>To: Jeff Pang <[EMAIL PROTECTED]>
>Cc: [EMAIL PROTECTED], beginners@perl.org
>Subject: Re: regex help
>
>On 8/21/07, Jeff Pang <[EMAIL PROTECTED]> wrote:
>snip
>> my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] or $b->[2] <=>
>> $a->[2] or $b->[3] <=> $a->[3] } map { [ $_, split/\D+/ ] } @arr;
>snip
>
>While splitting on non-number is a nifty solution, it would break if
>the custom tag can contain a number (16-custom2.2-14).  It is better
>to nail down the version number scheme and write a regex that pulls
>the required info from it that throws an error if a version does not
>match the scheme.

Have you seen this case in his data?

--
Jeff Pang - [EMAIL PROTECTED]
http://home.arcor.de/jeffpang/


RE: regex help

2007-08-21 Thread Tony Heal

OK I added this and I keep getting invalid format

foreach (@newValues){print "$_\n";}
my @versions;
while (@newValues) 
{
chomp;
die "invalid format" unless
my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
push @versions, [ $major, $minor, $build , $_];
}
foreach (@versions){print "$_\n";}
}

/tmp# ./trim.pl
14.20-33
14.20-34
14.18-29
14.18-33
14.18-34
14.18-35
14.18-37
14.20-27
14.20-28
14.20-29
14.20-30
14.20-31
14.20-32
14.16-30
14.16-31
invalid format at ./trim.pl line 41. (41 is the die line)


sorry Chas I first sent to you and not the list.

Tony Heal
Pace Systems Group, Inc.
800-624-5999
[EMAIL PROTECTED]
 



Re: regex help

2007-08-21 Thread Chas Owens

On 8/21/07, Tony Heal <[EMAIL PROTECTED]> wrote:
> OK I added this and I keep getting invalid format
>
> foreach (@newValues){print "$_\n";}
> my @versions;
> while (@newValues)
> {
> chomp;
> die "invalid format" unless
> my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
> push @versions, [ $major, $minor, $build , $_];
> }
> foreach (@versions){print "$_\n";}
> }
snip

That would be because the code makes no sense.  My example read a
version at a time from the DATA file handle, transformed it, and
pushed it onto an array, then sorted the array and printed it.  Yours
has all of the versions in an array and tries to loop over the array
with a while loop (which doesn't work to start with), and you never
bother to sort the data.  If you aren't reading from a file then you
might as well fold the first loop into the Schwartzian transform
(map -> sort -> unmap).  Please note that

die "bad format" unless
    my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;

is one statement and should be indented as above.  If you don't indent
it looks like the die and the assignment are unrelated.  If you find
the style confusing you may consider using this instead

my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/
    or die "bad format";

#!/usr/bin/perl

use strict;
use warnings;

#I don't know how you are getting these values
my @newValues = map { chomp; $_ } <DATA>;

print "unsorted\n";
print "$_\n" for @newValues;

@newValues =
    #unmap to recover the original data
    map { $_->[0] }
    #sort
    sort {
        $a->[1] <=> $b->[1] or
        $a->[2] <=> $b->[2] or
        $a->[3] <=> $b->[3]
    }
    #map into a sortable form
    map {
        die "bad format" unless
            my ($major, $minor, $build) =
                /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
        [$_, $major, $minor, $build]
    }
    @newValues;

print "sorted\n";
print "$_\n" for @newValues;

__DATA__
16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6


Re: regex help

2007-08-21 Thread D. Bolliger

Tony Heal am Dienstag, 21. August 2007:
> > -Original Message-
> > From: Chas Owens [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, August 21, 2007 9:50 AM
> > To: [EMAIL PROTECTED]
> > Cc: beginners@perl.org
> > Subject: Re: regex help
> >
> > On 8/21/07, Tony Heal <[EMAIL PROTECTED]> wrote:
> > > Here is a sample of the versions that I am using.
> >
> > snip
> >
> > Just to clarify, you have a version string with the following format:
> >
> > {major}{custom tag}.{minor}-{build}
> >
> > and you want the list sorted by major, then minor, then build.
> >
> > #!/usr/bin/perl
> >
> > use strict;
> > use warnings;
> >
> > my @versions;
> > while (<DATA>) {
> > chomp;
> > die "invalid format" unless
> > my ($major, $minor, $build) =
> > /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
> > push @versions, [ $major, $minor, $build, $_ ];
> > }
> >
> > print "$_->[-1]\n" for sort {
> > $a->[0] <=> $b->[0] or
> > $a->[1] <=> $b->[1] or
> > $a->[2] <=> $b->[2]
> > } @versions;
> >
> > __DATA__
> > 16.1-17
[snip]
> > 16-special.4-10
> > 16-special.5-1
> > 16-special.5-2
> > 16-special.6-6

Hello Tony

Just include the original line in the die message to see what caused it (an
empty line would, for example).
Based on that, you can then adapt the regex.

> OK I added this and I keep getting invalid format
>
> foreach (@newValues){print "$_\n";}
>   my @versions;
>   while (@newValues)
>   {
>   chomp;
>   die "invalid format" unless

die "invalid format of '$_'" unless

>   my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
>   push @versions, [ $major, $minor, $build , $_];
>   }
>   foreach (@versions){print "$_\n";}
> }
>
> /tmp# ./trim.pl
> 14.20-33
[snip]
> 14.16-31
> invalid format at ./trim.pl line 41. (41 is the die line)
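
A rough sketch of the loop with that change applied, assuming the version
strings are already in @newValues as in your script (the values below are
made-up samples). Note that a foreach loop sets $_ for each element, which
while (@newValues) never does, so the regex above had nothing to match
against:

```perl
use strict;
use warnings;

# Sample input; the real list would come from readdir as in the script.
my @newValues = ('14.20-33', '14-special.16-30', '', '14.16-31');

my @versions;
for (@newValues) {
    next unless /\S/;    # skip empty entries instead of dying on them
    my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/
        or die "invalid format of '$_'";
    push @versions, [ $major, $minor, $build, $_ ];
}

# Sort by major, then minor, then build, and recover the original strings.
my @sorted = map { $_->[-1] } sort {
    $a->[0] <=> $b->[0] or
    $a->[1] <=> $b->[1] or
    $a->[2] <=> $b->[2]
} @versions;

print "$_\n" for @sorted;
```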



Re: regex help

2007-08-22 Thread Mumia W..


On 08/21/2007 07:41 AM, Tony Heal wrote:

the list is a list of files by version. I need to keep the last 5 versions.

Jeff's code works fine except I am getting some empty strings at the beginning 
that I have not figured out.

Here is what I have so far. Lines 34 and 39 provide a printout for 
troubleshooting. Once I get this fixed all I
need to do is shift the top five from the list and unlink the rest.

#!/usr/bin/perl

use warnings;
use strict;

opendir (REPOSITORY, '/usr/local/repository/dists/');
my @repositories = readdir (REPOSITORY);
closedir (REPOSITORY);

my $packageRepo;
my @values;
my @newValues;
foreach (@repositories)
{
$packageRepo = $_;
chomp ($packageRepo);
opendir (packageREPO, 
"/usr/local/repository/dists/$packageRepo/non-free/binary-i386");
my @repoFiles = readdir (packageREPO);
closedir (packageREPO);
foreach (@repoFiles)
{
my $fileName = $_;
chomp ($fileName);
if ( /(.*)(([0-9][0-9])(-special)?\.([0-9])(-)([0-9]*))(.*)/)
{
push (@values, $2);
}
}
my %h;
	foreach (@values) 
	{

push (@newValues, $_) unless $h{$_}++
}
foreach (@newValues){print "$_\n";}
	my @new = map { $_->[0] } 
	sort { $b->[1] <=> $a->[1] } 
	map { [$_,(split/-/)[-1]] } 
	@newValues;

print "@new[0..4]\n";
}


Or for a line numbered version
http://rafb.net/p/asqgJo27.html

Tony Heal





[oops, sent to the wrong list before]

Sort::Maker should make short work for this task ;-)

All you have to do is to make a regex to pull out the version numbers.
After that, you're practically done:

use strict;
use warnings;
require Sort::Maker;

open (pkgREPO, '<', 'data/versions-list.txt')
or die "no versions list: $!";

my @versions;

while (my $line = <pkgREPO>) {
chomp $line;
push @versions, [ $line, $line =~ /^(\d+)(?:-[a-z]+)?\.(\d+)-(\d+)/ ];
}

close pkgREPO;

my $sorter = Sort::Maker::make_sorter(
'ST',
number => '$_->[1]',
number => '$_->[2]',
number => '$_->[3]',
);
die $@ unless $sorter;

my @sorted = $sorter->(@versions);
print "keep: $_->[0]\n" for @sorted[$#sorted-4 .. $#sorted];
print "delete: $_->[0]\n" for @sorted[0 .. $#sorted-5];

__END__

This is the output:

keep: 16.5-2
keep: 16-special.5-2
keep: 16.5-10
keep: 16.5-13
keep: 16-special.6-6
delete: 14-special.1-2
delete: 14-special.1-8
delete: 14-special.1-15
delete: 14-special.2-40
delete: 14-special.2-41
delete: 14-special.3-4
delete: 14-special.3-7
delete: 14-special.3-12
delete: 15-special.1-52
delete: 15-special.1-53
delete: 15-special.1-54
delete: 15.2-108
delete: 15.2-110
delete: 15.2-111
delete: 15.3-12
delete: 16.1-17
delete: 16.1-22
delete: 16.1-23
delete: 16.1-39
delete: 16.3-1
delete: 16.3-6
delete: 16.3-7
delete: 16.3-8
delete: 16.3-15
delete: 16-special.4-9
delete: 16-special.4-10
delete: 16.5-1
delete: 16-special.5-1
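
If installing Sort::Maker is not an option, roughly the same keep/delete
split can be sketched with the built-in sort. This is only a variant under
assumptions, not the code that produced the output above: the list is a
small made-up sample and $keep is an illustrative name.

```perl
use strict;
use warnings;

my @lines = qw(16.5-13 15.2-108 16-special.6-6 14-special.1-2
               16.5-10 16.3-15 16.5-2 15.3-12);
my $keep = 5;    # number of newest versions to keep

# Decorate each line with its numeric fields, sort, then undecorate.
my @sorted =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] or $a->[2] <=> $b->[2] or $a->[3] <=> $b->[3] }
    map  { [ $_, /^(\d+)(?:-[a-z]+)?\.(\d+)-(\d+)/ ] }
    grep { /^\d+(?:-[a-z]+)?\.\d+-\d+/ }    # drop lines that do not parse
    @lines;

# splice removes the last $keep elements from @sorted and returns them.
my @kept = @sorted > $keep ? splice(@sorted, -$keep) : splice(@sorted);

print "keep: $_\n"   for @kept;
print "delete: $_\n" for @sorted;
```

After the splice, @sorted holds only the older versions, which are the
candidates for unlink.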







Re: Regex help

2007-09-03 Thread John W. Krahn


Beginner wrote:
> Hi,

Hello,

> I am trying to come up with a regex to squash multiple commas into
> one. The line I am working on looks like this:
>
> SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , ,
>
> There are instances of /,\s{1,},/ and /,,/
>
> The bit that I am struggling with is finding a way to use a
> multiplier for the regex /,\s+/ but I have to be careful not to
> remove single entries. I guess the order of my substitutions is
> important here.


$ perl -le'
$_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, , 
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ];

print;
s/,\s*(?=,)//g;
print;
'
SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, , 
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , ,
SPEED OF LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL, 
CONCEPT,CONCEPTS,



$ perl -le'
$_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, , 
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ];

print;
$_ = join ",", grep /\S/, split /,/;
print;
'
SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, , 
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , ,
SPEED OF LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL, 
CONCEPT,CONCEPTS





John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


RE: Regex help

2007-09-03 Thread Andrew Curry

Christ That's certainly 1 way ;) 



RE: Regex help

2007-09-03 Thread Andrew Curry

Think

s/(\,+\s*)+/,/g;

Should work

It produces
SPEED OF LIGHT,LIGHT
SPEED,TRAVEL,TRAVELLING,DANGER,DANGEROUS,PHYSICAL,CONCEPT,CONCEPTS

If that's what you want. 
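
For reference, a small sketch of that substitution run against the sample
line. On the line exactly as it was posted, the run of commas at the end
collapses to a single trailing comma, so a second substitution is assumed
here to strip it (that step is an addition, not part of the regex above):

```perl
use strict;
use warnings;

my $line = 'SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,  '
         . 'DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ';

# Collapse each run of commas and whitespace that starts with a comma.
(my $squashed = $line) =~ s/(\,+\s*)+/,/g;

# Assumed extra step: drop a comma left over at the end of the line.
(my $trimmed = $squashed) =~ s/,\s*\z//;

print "$squashed\n";
print "$trimmed\n";
```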



RE: Regex help

2007-09-03 Thread Beginner

On 3 Sep 2007 at 16:15, Andrew Curry wrote:

> Think
> 
> s/(\,+\s*)+/,/g;
> 
> Should work
> 
> It produces
> SPEED OF LIGHT,LIGHT
> SPEED,TRAVEL,TRAVELLING,DANGER,DANGEROUS,PHYSICAL,CONCEPT,CONCEPTS
> 
> If that's what you want. 

Exactly what I want. Thanx,
Dp.





RE: Regex help

2007-09-03 Thread Beginner

On 3 Sep 2007 at 16:12, Andrew Curry wrote:

> $ perl -le'
> $_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ];
> print; s/,\s*(?=,)//g; print; '
> SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , SPEED OF
> LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL,
> CONCEPT,CONCEPTS,
> 
> 
> $ perl -le'
> $_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ]; print;
> $_ = join ",", grep /\S/, split /,/; print; '
> SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , SPEED OF
> LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL,
> CONCEPT,CONCEPTS
> 
> 
> 
> 
> John

Okay I need to ask what's going on here.  

I had to use the  

s/,\s*(?=,)//g  

expression because the  

s/(\,+\s*)+/,/g;  

regex in my code snip wasn't working as it did on the text snippet I 
originally supplied.  

=== code snip ===
 while () { 
chomp($_);  
s/"//g; 
s/\t/, /g;  
s/,\s*(?=,)//g; 
print "\"$_\"\n"; 
}
 == 

I can understand the 2nd method: A grouped, literal comma (\,), one 
or more times, followed by zero or more spaces.

The 2nd regex reads to me like, a comma then zero or more spaces but 
what's that (?=,) doing? Is it referring to the preceding expression 
and saying if it matches up to 1 time? I can't see what the equal 
sign is doing either.

Enlightenment please.
Dp.



Re: Regex help

2007-09-03 Thread John W. Krahn


Beginner wrote:
> On 3 Sep 2007 at 16:12, Andrew Curry wrote:

Please do not attribute to Andrew Curry a post that was actually submitted by
me (see my name at the end there.)  TIA

> > $ perl -le'
> > $_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> > DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ];
> > print; s/,\s*(?=,)//g; print; '
> > SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> > DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , ,
> > SPEED OF LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL,
> > CONCEPT,CONCEPTS,
> >
> > $ perl -le'
> > $_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> > DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ]; print;
> > $_ = join ",", grep /\S/, split /,/; print; '
> > SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
> > DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , ,
> > SPEED OF LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL,
> > CONCEPT,CONCEPTS
> >
> > John
>
> Okay I need to ask what's going on here.
>
> I had to use the
>
> s/,\s*(?=,)//g
>
> expression because the
>
> s/(\,+\s*)+/,/g;
>
> regex in my code snip wasn't working as it did on the text snippet I
> originally supplied.

"wasn't working" is not a very good description of the problem.

> === code snip ===
>  while () {
> chomp($_);

Why remove the newline and then add it back at the end of the loop?

> s/"//g;


It is more efficient to use transliteration to remove characters from a string:

tr/"//d;
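
A quick sketch of the difference, using a made-up sample string: both forms
remove the quotation marks, but tr/// does not go through the regex engine
and also returns the number of characters it deleted.

```perl
use strict;
use warnings;

my $with_s  = '"SPEED OF LIGHT","TRAVEL"';
my $with_tr = $with_s;

$with_s =~ s/"//g;                   # substitution: removes each quote
my $count = ($with_tr =~ tr/"//d);   # transliteration: deletes and counts

print "$with_s\n";
print "$with_tr ($count quotes removed)\n";
```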


> s/\t/, /g;
> s/,\s*(?=,)//g;
> print "\"$_\"\n";

You could use different quoting so you don't have to escape the quotation marks:

 print qq["$_"\n];

> }
>  ==
>
> I can understand the 2nd method: A grouped, literal comma (\,), one
> or more times, followed by zero or more spaces.
>
> The 2nd regex reads to me like, a comma then zero or more spaces but
> what's that (?=,) doing?


It is a zero-width positive look-ahead assertion.  It says that a comma *must* 
follow the pattern but is not included as part of the pattern.



> Is it referring to the preceding expression
> and saying if it matches up to 1 time? I can't see what the equal
> sign is doing either.
>
> Enlightenment please.


perldoc perlre




John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall
