
Re: regex help - only one value returned

2020-12-02 Thread Jim Gibson

In your original example:

print "match1='$1' '$2'\n" if ($T=~/^((mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
print "match2='$1' '$2'\n" if ($T=~/^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi);

the interior parentheses in example one terminate the alternation, so the last
alternative is 'sir'.

In example two, the alternation is not terminated until the first ')', so the
last alternative is 'sir .{5,}?', and the whole group is followed in the regular
expression by the "\n" character. Since in $T 'Miss' is not immediately followed
by "\n", the match fails. Vlado has explained how to group and terminate the
alternation without capturing the match result.
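
To see the three variants side by side, here is a small sketch. The contents of $T are an assumption, shortened from the sample data quoted further down the thread:

```perl
use strict;
use warnings;

# Stand-in for $T; assumed to resemble the OCR text quoted in the thread.
my $T = "Customer name and address\nMiss Jayne Doe\n19 Their Street\n";

# Example one: the inner parentheses end the alternation after 'sir'.
my @m1 = $T =~ /^((mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi;

# Example two: the alternatives are 'mr', ..., 'prof' and 'sir .{5,}?',
# and whichever one matches must be followed directly by "\n".
my @m2 = $T =~ /^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi;

# Non-capturing group: the alternation is bounded, only one capture remains.
my @m3 = $T =~ /^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi;

print "m1: @m1\n";
print "m2: ", (@m2 ? "@m2" : "(no match)"), "\n";
print "m3: @m3\n";
```

With this input, m1 holds two values ('Miss Jayne Doe' and 'Miss'), m2 is empty, and m3 holds only 'Miss Jayne Doe'.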

> On Dec 2, 2020, at 6:08 AM, Gary Stainburn  
> wrote:
> 
> On 02/12/2020 13:56, Vlado Keselj wrote:
>> Well, it seems that the first one is what you want, but you just need to
>> use $1 and ignore $2.
>> 
>> You do need parentheses in '(mr|mrs|miss|dr|prof|sir)' but if you do not
>> want for them to be captured in $2, you can use:
>> '(?:mr|mrs|miss|dr|prof|sir)'.  For example:
>> 
>> print "match3='$1' '$2'\n" if
>> ($T=~/^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
>> 
>> would give output:
>> 
>> match3='Miss Jayne Doe' ''
> Perfect, thank you.
> 
> I can't ignore $2 as it's in a loop with other regex that genuinely returns 
> multiple matches.  The amendment to the REGEX worked perfectly.

It is always best to save the results of a match with capturing in another 
variable. The capturing variables $1, $2, etc. are not reassigned if a match 
fails, so if you use them after a failed match, they will be the values left 
over from a previous match. So do this:

my $salutation = $1;
my $name = $2;

If you don’t want a possible undefined value, do this instead:

my $name = $2 || '';
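
One way to make sure the copies are only taken after a successful match is to assign the captures in list context inside the condition. This is only a sketch: the input lines and the pattern with two capture groups are made up for illustration.

```perl
use strict;
use warnings;

my @lines = ("Miss Jayne Doe", "19 Their Street");   # hypothetical input
my @report;

for my $line (@lines) {
    # In list context a match returns its captures, or an empty list on
    # failure, so the condition is false and nothing stale is copied.
    if (my ($salutation, $name) = $line =~ /^(mr|mrs|miss|dr|prof|sir) (.+)$/i) {
        push @report, "salutation='$salutation' name='$name'";
    }
    else {
        push @report, "no match for '$line'";
    }
}

print "$_\n" for @report;
```

Because $salutation and $name are lexical to the if statement, they cannot be used by accident after a failed match.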

Jim Gibson
j...@gibson.org

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/

Re: regex help - only one value returned

2020-12-02 Thread Gary Stainburn


On 02/12/2020 13:56, Vlado Keselj wrote:

> Well, it seems that the first one is what you want, but you just need to
> use $1 and ignore $2.
>
> You do need parentheses in '(mr|mrs|miss|dr|prof|sir)' but if you do not
> want for them to be captured in $2, you can use:
> '(?:mr|mrs|miss|dr|prof|sir)'.  For example:
>
> print "match3='$1' '$2'\n" if
> ($T=~/^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
>
> would give output:
>
> match3='Miss Jayne Doe' ''

Perfect, thank you.

I can't ignore $2 as it's in a loop with other regex that genuinely 
returns multiple matches.  The amendment to the REGEX worked perfectly.


Gary


Re: regex help - only one value returned

2020-12-02 Thread Vlado Keselj



Well, it seems that the first one is what you want, but you just need to 
use $1 and ignore $2.

You do need parentheses in '(mr|mrs|miss|dr|prof|sir)' but if you do not 
want for them to be captured in $2, you can use:
'(?:mr|mrs|miss|dr|prof|sir)'.  For example:

print "match3='$1' '$2'\n" if
($T=~/^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);

would give output:

match3='Miss Jayne Doe' ''

On Wed, 2 Dec 2020, Gary Stainburn wrote:

> I have an array of regex expressions that I apply to text returned from
> tesseract.
> 
> Each match that I get then gets stored for future processing. However, I'm
> struggling with one regex.
> 
> The problem is that:
> 
> 1) with brackets round the titles it returns two matches.
> 2) without brackets, it returns nothing.
> 
> Can anyone point me at the correct syntax please.
> 
> Gary
> 
> [root@dev dev]# ./t
> match1='Miss Jayne Doe' 'Miss'
> [root@dev dev]# cat t
> #!/usr/bin/perl
> 
> use strict;
> use warnings;
> 
> my $T=<<EOF;
> Customer name and address
> Miss Jayne Doe
> 19 Their Street
> Somewehere
> In Yorkshire
> IN1 3YY
> EOF
> 
> print "match1='$1' '$2'\n" if ($T=~/^((mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
> print "match2='$1' '$2'\n" if ($T=~/^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi);
> [root@dev dev]#

Re: Regex help needed

2013-01-09 Thread *Shaji Kalidasan*

Punit Jain,

This is not the optimized code but you can refactor it. This works for the 
given scenario, no matter the order of input data.

Hope it helps to some extent.

[code]
my $var = '';
my @args = ();
my %hash;

while (<DATA>) {
    chomp;
    my ($var,$arg) = split /=/,$_,2;
    if($var eq '{') {
        @args = (); #Reset if we encounter '{'
    }
    # Empty list for lines without '=' (the '{' and '}' lines)
    my @arg1 = defined $arg ? split(/,/,$arg) : ();
    # Keep the line only if it has more options than the best one so far
    if(scalar @arg1 > scalar @args) {
        $hash{$var} = $arg unless($var eq '{' || $var eq '}');
        @args = @arg1;
    }
}

foreach my $k (sort keys %hash) {
    print "$k = $hash{$k}\n";
}

__DATA__
{
test = (test123);
test = (test123,abc,xyz);
test = (test123,abc);
}
{
test1 = (passfile,pasfile1,user);
test1 = (passfile);
test1 = (passfile,pasfile1);
}
{
test2 = (temp);
test2 = (temp,temp1);
test2 = (temp,temp1,username);
}
{
test3 = (betty,betty1,jack);
test3 = (betty,betty1);
test3 = (betty);
}
[/code]

[output]
test  =  (test123,abc,xyz);
test1  =  (passfile,pasfile1,user);
test2  =  (temp,temp1,username);
test3  =  (betty,betty1,jack);
[/output]
 
best,
Shaji 
---
Your talent is God's gift to you. What you do with it is your gift back to God.
---



 From: punit jain contactpunitj...@gmail.com
To: beginners@perl.org beginners@perl.org 
Sent: Tuesday, 8 January 2013 5:58 PM
Subject: Regex help needed
 
Hi ,

I have a file as below : -

{
test = (test123);
test = (test123,abc);
test = (test123,abc,xyz);
}
{
test1 = (passfile);
test1 = (passfile,pasfile1);
test1 = (passfile,pasfile1,user);
}

and so on 

The requirement is to have the file parsing so that final output is  :-

test = (test123,abc,xyz);
test1 = (passfile,pasfile1,user);

So basically only pick the lines with maximum number of options for each
type.

Regards.

Re: Regex help needed

2013-01-09 Thread Dr.Ruud


On 2013-01-08 13:28, punit jain wrote:


{
test = (test123);
test = (test123,abc);
test = (test123,abc,xyz);
}
{
test1 = (passfile);
test1 = (passfile,pasfile1);
test1 = (passfile,pasfile1,user);
}

and so on 

The requirement is to have the file parsing so that final output is  :-

test = (test123,abc,xyz);
test1 = (passfile,pasfile1,user);

So basically only pick the lines with maximum number of options for each
type.


Or just print the last long line:

echo '{
test = (test123);
test = (test123,abc);
test = (test123,abc,xyz);
}
{
test1 = (passfile);
test1 = (passfile,pasfile1);
test1 = (passfile,pasfile1,user);
}
' |perl -wne'$o=$n||0;$p=$_,next if($n=length)$o;$n=3;print$p'

test = (test123,abc,xyz);
test1 = (passfile,pasfile1,user);


Which preserves order too. :)

--
Ruud



Re: Regex help needed

2013-01-08 Thread Jim Gibson


On Jan 8, 2013, at 4:28 AM, punit jain wrote:

 Hi ,
 
 I have a file as below : -
 
 {
 test = (test123);
 test = (test123,abc);
 test = (test123,abc,xyz);
 }
 {
 test1 = (passfile);
 test1 = (passfile,pasfile1);
 test1 = (passfile,pasfile1,user);
 }
 
 and so on 
 
 The requirement is to have the file parsing so that final output is  :-
 
 test = (test123,abc,xyz);
 test1 = (passfile,pasfile1,user);
 
 So basically only pick the lines with maximum number of options for each
 type.

The easiest solution I can think of would be to extract the first token on each 
line, use that token as a hash key, count the number of commas in each line, 
and save the line in the hash with the largest number of commas for each key. 
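
A minimal sketch of that approach, assuming the input looks exactly like the posted sample (the variable names are made up for illustration):

```perl
use strict;
use warnings;

# Sample lines in the format posted in the thread, deliberately out of order.
my @lines = (
    '{',
    'test = (test123);',
    'test = (test123,abc,xyz);',
    'test = (test123,abc);',
    '}',
);

my (%best_line, %best_count);
for my $line (@lines) {
    # The first token before '=' is the key; '{' and '}' lines are skipped.
    my ($key) = $line =~ /^\s*(\S+)\s*=/ or next;
    # Count the commas by forcing the /g match into list context.
    my $commas = () = $line =~ /,/g;
    if (!exists $best_count{$key} || $commas > $best_count{$key}) {
        $best_count{$key} = $commas;
        $best_line{$key}  = $line;
    }
}

print "$best_line{$_}\n" for sort keys %best_line;
```

Since the comparison uses the comma count rather than the position in the file, the order of the lines within a block does not matter.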

This will not work if your strings have commas. In that case, you might want to 
consider using a parsing module, such as Text::CSV, that will correctly handle 
your input data. You can use Text::CSV to split your input lines into fields 
and count the number of fields. However, you will first have to extract the 
quoted strings from the surrounding parentheses. You can use the Text::Balanced 
module to do that. Both Text::CSV and Text::Balanced are available at CPAN 
(http://search.cpan.org).

The best way for you to learn programming will be to attempt writing a program 
to accomplish your task, then post your program if you have trouble getting it 
to do what you want.

Good luck.




Re: Regex help needed

2013-01-08 Thread timothy adigun

Hi punit jain,

 Please check my comments below.

On Tue, Jan 8, 2013 at 1:28 PM, punit jain contactpunitj...@gmail.comwrote:

 Hi ,

 I have a file as below : -

 {
 test = (test123);
 test = (test123,abc);
 test = (test123,abc,xyz);
 }
 {
 test1 = (passfile);
 test1 = (passfile,pasfile1);
 test1 = (passfile,pasfile1,user);
 }

 and so on 

 The requirement is to have the file parsing so that final output is  :-

 test = (test123,abc,xyz);
 test1 = (passfile,pasfile1,user);

 So basically only pick the lines with maximum number of options for each
 type.

 Regards.


I basically agree with Jim on this:
Jim> The best way for you to learn programming will be to attempt writing a
Jim> program to accomplish your task, then post your program if you have
Jim> trouble getting it to do what you want.

However, I would suggest using a hash, provided the line with the maximum
number of options for each type *is the last one in each case*. Since *a hash
permits each key only once*, a later line with the same key overwrites the
earlier one. So, splitting each line on =, one can take the key and value for
the hash.

So, based on the data presented, one can write like so:

use warnings;
use strict;

my %collection_hash;

while (<DATA>) {
    chomp;
    if (/=/) {
        my ( $key, $value ) = split /=/, $_, 2;
        $collection_hash{$key} = $value;
    }
}

print $_, ' = ', $collection_hash{$_}, $/ for sort keys %collection_hash;

__DATA__
{
test = (test123);
test = (test123,abc);
test = (test123,abc,xyz);
}
{
test1 = (passfile);
test1 = (passfile,pasfile1);
test1 = (passfile,pasfile1,user);
}


*OUTPUT:*
test  =  (test123,abc,xyz);
test1  =  (passfile,pasfile1,user);

Please *NOTE* that this will only work as you want if the last line in
each case has the maximum number of options, as it does in the data you
showed here.





-- 
Tim

Re: Regex help

2012-12-22 Thread Paul Johnson

On Sat, Dec 22, 2012 at 04:45:21PM +0530, punit jain wrote:
 Hi,
 
 I have a file like below : -
 
 BEGIN:VCARD
 VERSION:2.1
 EMAIL:te...@test.com
 FN:test1
 REV:20101116T030833Z
 UID:644938456.1419.
 END:VCARD
 
 From (S___-0003) Tue Nov 16 03:10:15 2010
 content-class: urn:content-classes:person
 Date: Tue, 16 Nov 2010 11:10:15 +0800
 Subject: test
 Message-ID: 644938507.1420
 MIME-Version: 1.0
 Content-Type: text/x-vcard; charset=utf-8
 
 BEGIN:VCARD
 VERSION:2.1
 EMAIL:te...@test.com
 FN:test2
 REV:20101116T031015Z
 UID:644938507.1420
 END:VCARD
 
 
 
 My requirement is to get all text between BEGIN:VCARD and END:VCARD and all
 the instances. So o/p should be :-
 
 BEGIN:VCARD
 VERSION:2.1
 EMAIL:te...@test.com
 FN:test1
 REV:20101116T030833Z
 UID:644938456.1419.
 END:VCARD
 
 BEGIN:VCARD
 VERSION:2.1
 EMAIL:te...@test.com
 FN:test2
 REV:20101116T031015Z
 UID:644938507.1420
 END:VCARD
 
 I am using below regex  :-
 
 my $fh = IO::File->new($file, "r");
 my $script = do { local $/; <$fh> };
 close $fh;
 if (
$script =~ m/
 (^BEGIN:VCARD\s*(.*)
 ^END:VCARD\s+)/sgmix
 ){
 print OUTFILE $1."\n";
 }
 
 However it just prints 1st instance and not all.

It also prints the text between the two instances, right?

 Any suggestions ?

You need a non greedy match .*? instead of the greedy match .* that you
are using.  Then you'll need to use while instead of if.
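
A sketch of what that could look like. The sample text is a shortened, made-up stand-in for the slurped file, and the pattern is simplified from the one posted:

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the slurped file contents.
my $script = join "\n",
    'BEGIN:VCARD', 'FN:test1', 'END:VCARD',
    '',
    'Subject: test',
    '',
    'BEGIN:VCARD', 'FN:test2', 'END:VCARD',
    '';

my @cards;
# .*? stops at the first END:VCARD it can reach; while with /g then
# resumes after each match and finds the next card.
while ($script =~ /^(BEGIN:VCARD.*?^END:VCARD)$/smg) {
    push @cards, $1;
}

print "$_\n\n" for @cards;
```

With the greedy .* the first match would run from the first BEGIN:VCARD to the last END:VCARD, which is why the text between the cards was printed too.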

Or perhaps you'd prefer:

 $ perl -ne 'print if /BEGIN:VCARD/ .. /END:VCARD/' < in > out

or

 $ perl -n00e 'print if /^BEGIN:VCARD/' < in > out

See perldoc perlrun for the switches and "Range Operators" in perldoc
perlop for ..

-- 
Paul Johnson - p...@pjcj.net
http://www.pjcj.net


Re: Regex help

2012-12-22 Thread David Precious

On Sat, 22 Dec 2012 16:45:21 +0530
punit jain contactpunitj...@gmail.com wrote:

 Hi,
 
 I have a file like below : -

[snipped example - vcards with mail headers etc in between]


 My requirement is to get all text between BEGIN:VCARD and END:VCARD
 and all the instances. So o/p should be :-
[...]
 I am using below regex  :-
[...]
 Any suggestions ?

You've already had a reply indicating how to solve the problem you were
having with regexes, so I won't touch on that.

What I will advise, is that for any task you're trying to accomplish,
there's a pretty good chance someone has already solved that and made
code available on CPAN that will help you - so always check CPAN first,
to avoid unnecessarily reinventing the wheel each time (unless you're
doing so solely for a learning experience, of course).

In this case, parsing vcards is likely a common task - a quick look on
CPAN turns up Text::vCard::Addressbook:

https://metacpan.org/module/Text::vCard::Addressbook


From the synopsis:

  use Text::vCard::Addressbook;

  my $address_book = Text::vCard::Addressbook->new(
      { 'source_file' => '/path/to/address.vcf', } );

  foreach my $vcard ( $address_book->vcards() ) {
      print "Got card for " . $vcard->fullname() . "\n";
  }

It will ignore the non-vcard content in the example you provided, and
just provide you easy access to the data from each vcard.

That's a much nicer approach than extracting it yourself with regexes.

Cheers

Dave P


-- 
David Precious (bigpresh) dav...@preshweb.co.uk
http://www.preshweb.co.uk/ www.preshweb.co.uk/twitter
www.preshweb.co.uk/linkedinwww.preshweb.co.uk/facebook
www.preshweb.co.uk/cpanwww.preshweb.co.uk/github




Re: Regex help

2012-12-22 Thread Rob Dixon


On 22/12/2012 11:15, punit jain wrote:

Hi,

I have a file like below : -

BEGIN:VCARD
VERSION:2.1
EMAIL:te...@test.com
FN:test1
REV:20101116T030833Z
UID:644938456.1419.
END:VCARD

 From (S___-0003) Tue Nov 16 03:10:15 2010
content-class: urn:content-classes:person
Date: Tue, 16 Nov 2010 11:10:15 +0800
Subject: test
Message-ID: 644938507.1420
MIME-Version: 1.0
Content-Type: text/x-vcard; charset=utf-8

BEGIN:VCARD
VERSION:2.1
EMAIL:te...@test.com
FN:test2
REV:20101116T031015Z
UID:644938507.1420
END:VCARD



My requirement is to get all text between BEGIN:VCARD and END:VCARD and all
the instances. So o/p should be :-

BEGIN:VCARD
VERSION:2.1
EMAIL:te...@test.com
FN:test1
REV:20101116T030833Z
UID:644938456.1419.
END:VCARD

BEGIN:VCARD
VERSION:2.1
EMAIL:te...@test.com
FN:test2
REV:20101116T031015Z
UID:644938507.1420
END:VCARD

I am using below regex  :-

my $fh = IO::File->new($file, "r");
my $script = do { local $/; <$fh> };
 close $fh;
 if (
$script =~ m/
 (^BEGIN:VCARD\s*(.*)
 ^END:VCARD\s+)/sgmix
 ){
 print OUTFILE $1."\n";
 }
 }

However it just prints 1st instance and not all.

Any suggestions ?


This is very simply done with Perl's range operator. See the program
below.

Rob


use strict;
use warnings;

open my $fh, '<', 'vcard.txt' or die $!;

while (<$fh>) {
  print if /^BEGIN:VCARD/ .. /^END:VCARD/;
}

**output**

BEGIN:VCARD
VERSION:2.1
EMAIL:te...@test.com
FN:test1
REV:20101116T030833Z
UID:644938456.1419.
END:VCARD
BEGIN:VCARD
VERSION:2.1
EMAIL:te...@test.com
FN:test2
REV:20101116T031015Z
UID:644938507.1420
END:VCARD



Re: regex help

2011-10-28 Thread Chris Stinemetz

On Wed, Oct 19, 2011 at 1:10 AM, Leo Susanto leosusa...@gmail.com wrote:

 use strict;
 my %CELL;
 my %CELL_TYPE_COUNT;
 my $timestamp;
 my $hour;
 while (my $line = <DATA>) {
     if ($line =~ m|\d{1,2}/\d{1,2}/\d{2} ((\d{1,2}):\d{1,2}:\d{1,2})|) {
         #10/17/11 18:25:20 #578030
         $timestamp = $1;
         $hour = $2;
     }

     # take CELL number into $1 and the information after the number
     # (and before the first comma) into $2
     if ($line =~ /CELL\s+(\d+)\s+(.+?),.+?HEH/) {
         if ((17 <= $hour)&&($hour <=21)) {
             $CELL{$hour}{$1}{$2}++;
             $CELL_TYPE_COUNT{$2}++;
         }
     }
 }




Would someone help me understand what this block of code is doing after the
if condition? Is it utilizing references and counting the occurrences of the
keys?
I am having some trouble digesting it.

Thanks for the clarification,

Chris

Re: regex help

2011-10-28 Thread Jim Gibson

On 10/28/11 Fri  Oct 28, 2011  2:15 PM, Chris Stinemetz
chrisstinem...@gmail.com scribbled:

 On Wed, Oct 19, 2011 at 1:10 AM, Leo Susanto leosusa...@gmail.com wrote:
 
 use strict;
 my %CELL;
 my %CELL_TYPE_COUNT;
 my $timestamp;
 my $hour;
 while (my $line = <DATA>) {
     if ($line =~ m|\d{1,2}/\d{1,2}/\d{2} ((\d{1,2}):\d{1,2}:\d{1,2})|) {
         #10/17/11 18:25:20 #578030
         $timestamp = $1;
         $hour = $2;
     }

     # take CELL number into $1 and the information after the number
     # (and before the first comma) into $2
     if ($line =~ /CELL\s+(\d+)\s+(.+?),.+?HEH/) {
         if ((17 <= $hour)&&($hour <=21)) {
             $CELL{$hour}{$1}{$2}++;
             $CELL_TYPE_COUNT{$2}++;
         }
     }
 }
 
 
 
 
 Would someone help me understand what this block of code is doing after the
 if condition? Is it utilizing references and counting the occurences of the
 keys?
 I am having some trouble digesting it.

The best way to figure what it is doing is to print out the values of $hour,
$1, and $2, $CELL{$hour}{$1}{$2}, and $CELL_TYPE_COUNT{$2} before and after
the if statement block.

You should be able to combine those two regular expressions being applied to
$line, but I would need to see typical data lines to be sure, and to see how
to do that.

It looks like it is counting cells and cell types, judging from the names of
the variables and the comments.
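
For what it is worth, here is a small sketch of what the two ++ lines do to the hashes. The triples stand in for the values the regexes would capture and are made up for illustration:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order in the dump

my (%CELL, %CELL_TYPE_COUNT);

# Hypothetical (hour, cell number, cell info) triples, standing in for
# $hour, $1 and $2 in the posted program.
my @captures = ( [18, 221, 'CDM 2'], [18, 220, 'CDM 1'], [18, 220, 'CDM 1'] );

for my $c (@captures) {
    my ($hour, $cell, $info) = @$c;
    # The intermediate hash references are created on demand
    # (autovivification); the innermost value is a plain counter.
    $CELL{$hour}{$cell}{$info}++;
    $CELL_TYPE_COUNT{$info}++;
}

print Dumper(\%CELL, \%CELL_TYPE_COUNT);
```

So yes, %CELL is a hash of references to hashes of references to hashes, three levels deep, and each innermost value counts how often that hour, cell and info combination was seen.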




Re: regex help

2011-10-19 Thread Leo Susanto

use strict;
my %CELL;
my %CELL_TYPE_COUNT;
my $timestamp;
my $hour;
while (my $line = <DATA>) {
    if ($line =~ m|\d{1,2}/\d{1,2}/\d{2} ((\d{1,2}):\d{1,2}:\d{1,2})|) {
        #10/17/11 18:25:20 #578030
        $timestamp = $1;
        $hour = $2;
    }

    # take CELL number into $1 and the information after the number
    # (and before the first comma) into $2
    if ($line =~ /CELL\s+(\d+)\s+(.+?),.+?HEH/) {
        if ((17 <= $hour)&&($hour <=21)) {
            $CELL{$hour}{$1}{$2}++;
            $CELL_TYPE_COUNT{$2}++;
        }
    }
}

# header
print "HOUR, CELL,".join(", ",sort keys %CELL_TYPE_COUNT)."\n";
# body
# you can use map function, but it never sits well on my brain
foreach my $hour (sort keys %CELL) {
    foreach my $cellNo (sort keys %{$CELL{$hour}}) {
        print "$hour, $cellNo";
        foreach my $info (sort keys %CELL_TYPE_COUNT) {
            if (exists $CELL{$hour}{$cellNo}{$info}) {
                print ", $CELL{$hour}{$cellNo}{$info}";
            }
            else {
                print ", 0";
            }
        }
        print "\n";
    }
}


__DATA__
10/17/11 10:25:20 #578030

 25 REPT:CELL 221 CDM 2, CRC, HEH
SUPPRESSED MSGS: 0
ERROR TYPE: ONEBTS MODULAR CELL ERROR
SET: MLG BANDWIDTH CHANGE
MLG 1 BANDWIDTH = 1536

  00  00  06  00  00  00  00  00
  00  00  00  00  00  00  00  00
  00  00  00  00

10/17/11 18:25:20 #578031

 25 REPT:CELL 221 CDM 2, CRC, HEH
SUPPRESSED MSGS: 0
ERROR TYPE: ONEBTS MODULAR CELL ERROR
SET: DS1-MLG ASSOCIATION CHANGE
MLG 1 DS1 1,2

  00  00  00  00  00  00  00  00
  03  00  00  00  01  00  05  05

#my own test data
10/17/11 18:25:20 #578031
 25 REPT:CELL 220 CDM 1, CRC, HEH
10/17/11 18:25:20 #578031
 25 REPT:CELL 220 CDM 1, CRC, HEH
10/17/11 19:25:20 #578031
 25 REPT:CELL 220 CDM 1, CRC, HEH

On Tue, Oct 18, 2011 at 1:16 AM, Chris Stinemetz
chrisstinem...@gmail.com wrote:
 On Mon, Oct 17, 2011 at 10:57 PM, Leo Susanto leosusa...@gmail.com wrote:
 From looking at the regex

  if ($line =~ 
 /17|18|19|20|21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH/){

 against the data

 10/17/11 18:25:20 #578030

  25 REPT:CELL 221 CDM 2, CRC, HEH
     SUPPRESSED MSGS: 0
     ERROR TYPE: ONEBTS MODULAR CELL ERROR
     SET: MLG BANDWIDTH CHANGE
     MLG 1 BANDWIDTH = 1536

 I would assume $1 and $2 wouldn't match to anything plus $5 doesn't exist.

 Could you please let us know which part of the data you want to extract?

 Fill in the blanks
 $1=
 $2=
 $3=
 $4=
 $5=


 Thanks everyone. I hope this clarifies what I am trying to match. For
 example with this input:

 10/17/11 18:25:20 #578030

  25 REPT:CELL 221 CDM 2, CRC, HEH
     SUPPRESSED MSGS: 0
     ERROR TYPE: ONEBTS MODULAR CELL ERROR
     SET: MLG BANDWIDTH CHANGE
     MLG 1 BANDWIDTH = 1536


 $1= Match the time stamp Hour:Min:Sec only if the hour is >= 17 and hour <= 21
 $2= capture CELL number
 $3= capture the information after the CELL number (and before the first comma)

 Thank you,

 Chris



Re: regex help

2011-10-18 Thread John W. Krahn


Brian Fraser wrote:

On Tue, Oct 18, 2011 at 12:32 AM, Chris Stinemetz
chrisstinem...@gmail.comwrote:


/17|18|19|20|21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH/



Spot the issue:
/
  17#Or
| 18#Or
| 19#Or
| 20#Or
| 21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH
/x

For anything but 21, the regex is only two numbers! You need to enclose the
alternatives in () or (?:), depending on whether you want to capture them
or not.

That aside, please be very mindful that \d and . are both code smells. The
former will match much, much more than just [0-9] -- grab the unichars[0]
program from Unicode::Tussle[1] if you want to see for yourself. Either use
the /a switch (or the more localized form (?a:), both available in newer
Perls), or [0-9], or \p{PosixDigit}, or (your favorite way here. TIMTOWTDI
applies).

The dot is also problematic. You aren't using the /s switch, so it actually
matches [^\n].


Correct.


Is that what you want? Are you certain that no one is going
to come and, after reading Perl Best Practices, will try to helpfully but
wrongly add the /smx flags and screw up your regex?


It doesn't really matter because the regular expression is matched 
against a line (via readline) and as such will only contain one 
newline and that newline will be at the end of the line so it will 
match the same with or without the /s option.


Also, as the regular expression does not contain either the ^ anchor or 
the $ anchor it will match the same with or without the /m option.




John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction.   -- Albert Einstein


Re: regex help

2011-10-18 Thread John W. Krahn


Chris Stinemetz wrote:

Hello,


Hello,


[*SNIP*]


#!/usr/bin/perl

use warnings;
use strict;
use POSIX;

# my $filepath = sprintf("/omp/omp-data/logs/OMPROP1/%s.APX",strftime("%y%m%d%H",localtime));
my $filepath = ("/tmp/110923.APX"); # for testing

my $runTime =
sprintf("/home/cstinemetz/programs/%s.txt",strftime("%Y-%m-%d-%H:%M",localtime));

my $fileDate = strftime("%y%m%d%H%",localtime);

open my $fh, '<', $filepath or die "ERROR opening $filepath: $!";
open my $out, '>', $runTime or die "ERROR opening $runTime: $!";

my %date;
my %cell;
my %heh_type_count;
while (my $line = <$fh>) {
   if ($line =~
/17|18|19|20|21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH/){


As you haven't changed the Input Record Separator, the code
my $line = <$fh> will read one line from the file, and that line will
have only one newline, at its end, so /\n+\n+/ in the middle of your
regular expression will never match.



John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction.   -- Albert Einstein


Re: regex help

2011-10-18 Thread Chris Stinemetz

On Mon, Oct 17, 2011 at 10:57 PM, Leo Susanto leosusa...@gmail.com wrote:
 From looking at the regex

  if ($line =~ 
 /17|18|19|20|21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH/){

 against the data

 10/17/11 18:25:20 #578030

  25 REPT:CELL 221 CDM 2, CRC, HEH
     SUPPRESSED MSGS: 0
     ERROR TYPE: ONEBTS MODULAR CELL ERROR
     SET: MLG BANDWIDTH CHANGE
     MLG 1 BANDWIDTH = 1536

 I would assume $1 and $2 wouldn't match to anything plus $5 doesn't exist.

 Could you please let us know which part of the data you want to extract?

 Fill in the blanks
 $1=
 $2=
 $3=
 $4=
 $5=


Thanks everyone. I hope this clarifies what I am trying to match. For
example with this input:

10/17/11 18:25:20 #578030

  25 REPT:CELL 221 CDM 2, CRC, HEH
 SUPPRESSED MSGS: 0
 ERROR TYPE: ONEBTS MODULAR CELL ERROR
 SET: MLG BANDWIDTH CHANGE
 MLG 1 BANDWIDTH = 1536


$1= Match the time stamp Hour:Min:Sec only if the hour is >= 17 and hour <= 21
$2= capture CELL number
$3= capture the information after the CELL number (and before the first comma)

Thank you,

Chris


Re: regex help

2011-10-18 Thread Igor Dovgiy

Maybe this'll be helpful. )

my $time_rx = qr/(?<timestamp>
   (?<hour> \d{2} )
   (?: :\d{2} ){2}
 )
/x;

my $cell_record_rx = qr/CELL
  \s+
  (?<cell_number> \d+)
  \s+
  (?<cell_info> [^,]+)
 /x;

my $records_ref;
my $record_ts;
while (<>) {
  if ($record_ts) {
    # looking for record data of this particular timestamp
    if (/$cell_record_rx/) {
      ++$records_ref->{$record_ts}{ $+{cell_number} }{ $+{cell_info} };
      undef $record_ts;
    }
  }
  else {
    # scanning for next valid record
    if (/$time_rx/
        && $+{hour} >= 17 && $+{hour} <= 21) {
      $record_ts = $+{timestamp};
    }
  }
}

-- iD

2011/10/18 Chris Stinemetz chrisstinem...@gmail.com

 On Mon, Oct 17, 2011 at 10:57 PM, Leo Susanto leosusa...@gmail.com
 wrote:
  From looking at the regex
 
   if ($line =~
 /17|18|19|20|21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH/){
 
  against the data
 
  10/17/11 18:25:20 #578030
 
   25 REPT:CELL 221 CDM 2, CRC, HEH
  SUPPRESSED MSGS: 0
  ERROR TYPE: ONEBTS MODULAR CELL ERROR
  SET: MLG BANDWIDTH CHANGE
  MLG 1 BANDWIDTH = 1536
 
  I would assume $1 and $2 wouldn't match to anything plus $5 doesn't
 exist.
 
  Could you please let us know which part of the data you want to extract?
 
  Fill in the blanks
  $1=
  $2=
  $3=
  $4=
  $5=
 

 Thanks everyone. I hope this clarifies what I am trying to match. For
 example with this input:

 10/17/11 18:25:20 #578030

  25 REPT:CELL 221 CDM 2, CRC, HEH
 SUPPRESSED MSGS: 0
 ERROR TYPE: ONEBTS MODULAR CELL ERROR
 SET: MLG BANDWIDTH CHANGE
 MLG 1 BANDWIDTH = 1536


 $1= Match the time stamp Hour:Min:Sec only if the hour is >= 17 and hour <=
 21
 $2= capture CELL number
 $3= capture the information after the CELL number (and before the first
 comma)

 Thank you,

 Chris


Re: regex help

2011-10-17 Thread Leo Susanto

From looking at the regex

  if ($line =~ 
 /17|18|19|20|21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH/){

against the data

 10/17/11 18:25:20 #578030

  25 REPT:CELL 221 CDM 2, CRC, HEH
     SUPPRESSED MSGS: 0
     ERROR TYPE: ONEBTS MODULAR CELL ERROR
     SET: MLG BANDWIDTH CHANGE
     MLG 1 BANDWIDTH = 1536

I would assume $1 and $2 wouldn't match to anything plus $5 doesn't exist.

Could you please let us know which part of the data you want to extract?

Fill in the blanks
$1=
$2=
$3=
$4=
$5=


On Mon, Oct 17, 2011 at 8:32 PM, Chris Stinemetz
chrisstinem...@gmail.com wrote:
 Hello,

 I am getting the following error when I am trying to use regex to
 match a pattern and then access the memory variables:

 Any insight as to what I am doing wrong is greatly appreciated.

 Thank you,

 Chris

 Use of uninitialized value $1 in hash element at ./heh.pl line 22,
 <$fh> line 1211.
 Use of uninitialized value $4 in hash element at ./heh.pl line 22,
 <$fh> line 1211.
 Use of uninitialized value $5 in hash element at ./heh.pl line 22,
 <$fh> line 1211.

 An example of what I am trying to match is:

 10/17/11 18:25:20 #578030

  25 REPT:CELL 221 CDM 2, CRC, HEH
     SUPPRESSED MSGS: 0
     ERROR TYPE: ONEBTS MODULAR CELL ERROR
     SET: MLG BANDWIDTH CHANGE
     MLG 1 BANDWIDTH = 1536

       00  00  06  00  00  00  00  00
       00  00  00  00  00  00  00  00
       00  00  00  00



 10/17/11 18:25:20 #578031

  25 REPT:CELL 221 CDM 2, CRC, HEH
     SUPPRESSED MSGS: 0
     ERROR TYPE: ONEBTS MODULAR CELL ERROR
     SET: DS1-MLG ASSOCIATION CHANGE
     MLG 1 DS1 1,2

       00  00  00  00  00  00  00  00
       03  00  00  00  01  00  05  05


 My program:

 #!/usr/bin/perl

 use warnings;
 use strict;
 use POSIX;

 # my $filepath = sprintf("/omp/omp-data/logs/OMPROP1/%s.APX",strftime("%y%m%d%H",localtime));
 my $filepath = ("/tmp/110923.APX");     # for testing

 my $runTime =
 sprintf("/home/cstinemetz/programs/%s.txt",strftime("%Y-%m-%d-%H:%M",localtime));

 my $fileDate = strftime("%y%m%d%H%",localtime);

 open my $fh, '<', $filepath or die "ERROR opening $filepath: $!";
 open my $out, '>', $runTime or die "ERROR opening $runTime: $!";

 my %date;
 my %cell;
 my %heh_type_count;
 while (my $line = <$fh>) {
   if ($line =~
 /17|18|19|20|21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH/){
     $cell{$1}{$4}{$5}++;
     $heh_type_count{$5}++;
   }
 }

 # header
 print "HOUR\t"."CELL\t".join("\t",sort keys %heh_type_count)."\n";
 # body
 foreach my $cellNo (sort {$a <=> $b} keys %cell) {
   print $cellNo;
   foreach my $heh_hits (sort keys %heh_type_count) {
     if (exists $cell{$cellNo}{$heh_hits}) {
       print "\t $cell{$cellNo}{$heh_hits}";
     }
     else {
       print "\t 0";
     }
   }
   print "\n";
 }


Re: regex help

2011-10-17 Thread Brian Fraser

On Tue, Oct 18, 2011 at 12:32 AM, Chris Stinemetz
chrisstinem...@gmail.comwrote:

 /17|18|19|20|21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH/


Spot the issue:
/
 17#Or
   | 18#Or
   | 19#Or
   | 20#Or
   | 21+:(\d+):(\d+)+\n+\n+CELL\s+(\d+)\s+(.+?),.+?HEH
/x

For anything but 21, the regex is only two numbers! You need to enclose the
alternatives in () or (?:), depending on whether you want to capture them
or not.
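
A quick sketch of the difference, using the timestamp line from the thread. The patterns are cut down to the hour, minute and second part, so they are an illustration and not the full regex:

```perl
use strict;
use warnings;

my $line = "10/17/11 18:25:20 #578030";   # timestamp line from the thread

# Ungrouped: the alternatives are '17', '18', '19', '20' and '21:(\d+):(\d+)'.
# The bare '17' already matches inside the date, so no capture is ever set.
my ($ungrouped_hit, $ungrouped_cap);
if ($line =~ /17|18|19|20|21:(\d+):(\d+)/) {
    ($ungrouped_hit, $ungrouped_cap) = ($&, $1);
}

# Grouped: only the hour alternates; ':MM:SS' is required after any of them.
my ($hour, $min, $sec) = $line =~ /\b(17|18|19|20|21):(\d+):(\d+)/;

print "ungrouped matched '$ungrouped_hit', \$1 is ",
      defined $ungrouped_cap ? "'$ungrouped_cap'" : "undef", "\n";
print "grouped: hour=$hour min=$min sec=$sec\n";
```

The ungrouped pattern succeeds on the '17' in the date and leaves $1 undefined, which is where the "uninitialized value" warnings come from. The grouped one skips the date, because '17' is not followed by a colon there, and picks up 18:25:20.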

That aside, please be very mindful that \d and . are both code smells. The
former will match much, much more than just [0-9] -- grab the unichars[0]
program from Unicode::Tussle[1] if you want to see for yourself. Either use
the /a switch (or the more localized form (?a:), both available in newer
Perls), or [0-9], or \p{PosixDigit}, or (your favorite way here. TIMTOWTDI
applies).

The dot is also problematic. You aren't using the /s switch, so it actually
matches [^\n]. Is that what you want? Are you certain that no one is going
to come and, after reading Perl Best Practices, will try to helpfully but
wrongly add the /smx flags and screw up your regex? If you -really- want to
match anything, use \p{Any}, or \X, and you have to know the difference
between the two, otherwise you are doing it wrong. See [2] and [3], though
you might want to make a cup of tea and sit somewhere comfortable first, as
they are neither easy nor quick reads.
But chances are that you don't want that. Which is actually much simpler! If
you want to match anything-until-the-next-comma, use [^,]+
(And if you really want [^\n], you could use \N, which is not-a-newline, or
even better, \V, which is not-a-vertical-space)
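
As a small illustration of the [^,]+ idea, here is a sketch. The sample line
is taken from the data posted in this thread, and cell_fields is a made-up
helper name:

```perl
use strict;
use warnings;

my $line = "CELL 983 CDM 1, CCU 1, CE 5, HEH";

# [^,]+ says exactly what is meant: everything up to the next comma.
# It cannot run past a comma, whatever flags get added to the regex later.
sub cell_fields {
    my ( $cell, $first ) = $_[0] =~ /CELL\s+([0-9]+)\s+([^,]+),/;
    return ( $cell, $first );
}

my ( $cell, $first ) = cell_fields($line);
print "$cell | $first\n";    # 983 | CDM 1
```

Note that [^,] will happily match a newline; if the field must also stay on
one line, [^,\n]+ is the class to use.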

[0] https://www.metacpan.org/module/unichars
[1] https://metacpan.org/release/Unicode-Tussle
[2] http://www.nntp.perl.org/group/perl.perl5.porters/2011/07/msg174287.html
[3] http://www.nntp.perl.org/group/perl.perl5.porters/2011/07/msg174338.html

Re: regex help

2011-10-10 Thread Brian Fraser

On Mon, Oct 10, 2011 at 4:56 PM, Chris Stinemetz
chrisstinem...@gmail.com wrote:

 Any help is appreciated.

 Once I match HEH how can alter the program to print the contents that
 are in the  two lines directly above the match?

 For example in this case I would like the print results to be:

 **01 REPT:CELL 983 CDM 1, CCU 1, CE 5, HEHTimestamp: 10/10/11 00:01:18

 #!/usr/bin/perl

 use warnings;
 use strict;

 while(my $hehline = <DATA>) {
     chomp $hehline;
     if ($hehline =~ /, HEH/) {
         print "$hehline \n";
     }
 }



 __DATA__
 10/10/11 00:01:17 #984611

 A 01 REPT:CELL 833 CP FAILURE, UNANSWERED TERMINATION
 CDMA TRAFFIC CHANNEL CONFIRMATION FAILURE
 TRAFFIC CHANNEL FAILURE REASON - ACQUIRE MOBILE FAILURE [2]
 DCS 1 TG 1723 TM 374 SG 0 ANT 2
 CARRIER 4, CHAN UNAVAIL FS-ECP ID 1, SYS ID 4681
 DN 3168710330, MIN 3164094259, IMSI UNAVAIL
 SN ###2ddff3 MEID Xa0###629cc SCM ba
 ALW CDMA, ASGN CDMA
 CDM 1, CCU 2, CE 64, MLG 1/MLG_CDM 1
   DCS 1/PSU 0/SM 3/BHS 6, ECP ID 1, SYS ID 4681



 10/10/11 00:01:18 #984614

 **01 REPT:CELL 983 CDM 1, CCU 1, CE 5, HEH
 SUPPRESSED MSGS: 0
 FT PL SECTOR 3 CARRIER 1 (1.9 GHz PCS)
  FAILURE: OUT OF RANGE
 PILOT LEVEL: MEASURED = 28.3 dBmEXPECTED = 33.8 dBm
 SECONDARY UNIT: CDM 1, CBR 3



Couple of ways.
You could save the line numbers on the first pass, then read the file again
and print the relevant lines. Remember that $. has the current line number.

my @lines;
while (...) {
    ...
    if (/,\s+HEH/) {
        push @lines, $.;
    }
}

Or you could do the same as above, but use Tie::File instead of reading the
file twice.

Or you could keep saving the previous two lines (this assumes the HEH can't
be on the first two lines. If it can, you'll have to modify the proggy
accordingly):

use feature 'say';   # say needs Perl 5.10+

my ($one, $two) = (scalar <DATA>, scalar <DATA>);
while (my $hehline = <DATA>) {
    ...
    if ($hehline =~ /,\s+HEH/) {
        say "[$one]\n[$two]";
    }
    ($one, $two) = ($two, $hehline);
}

Or, looking at your data, you could read paragraphs instead of going
line-by-line -- apparently each chunk is separated by three (four?)
newlines, so

{
    local $/ = "\n\n\n";
    while (my $hehline = <DATA>) {
        ... # shenanigans here
    }
}
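
To flesh out the chunk-at-a-time idea, here is a rough sketch. It assumes the
layout seen in the posted data (a timestamp line, a blank line, then the
report line) and reads from an in-memory string rather than DATA so that it
stands alone. heh_reports and the shortened $log sample are made up for the
example:

```perl
use strict;
use warnings;

# Shortened, hypothetical excerpt in the same layout as the posted data.
my $log = join "",
    "10/10/11 00:01:17 #984611\n",
    "\n",
    "A 01 REPT:CELL 833 CP FAILURE, UNANSWERED TERMINATION\n",
    "\n\n\n",
    "10/10/11 00:01:18 #984614\n",
    "\n",
    "**01 REPT:CELL 983 CDM 1, CCU 1, CE 5, HEH\n",
    "SUPPRESSED MSGS: 0\n";

sub heh_reports {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "\n\n\n";    # each read now returns a whole chunk
    my @out;
    while ( my $chunk = <$fh> ) {
        # Timestamp line, one blank line, then a line ending in ", HEH".
        if ( $chunk =~ m{^(\d\d/\d\d/\d\d \d\d:\d\d:\d\d)[^\n]*\n\n([^\n]*,\s+HEH)$}m ) {
            push @out, "$2 Timestamp: $1";
        }
    }
    return @out;
}

print "$_\n" for heh_reports($log);
```

Since the timestamp and the HEH line arrive in the same chunk, there is no
need to remember earlier lines between iterations.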

Re: regex help

2011-10-10 Thread Chris Charley


Chris Stinemetz  wrote in message

Any help is appreciated.

Once I match HEH how can alter the program to print the contents that
are in the  two lines directly above the match?

For example in this case I would like the print results to be:

**01 REPT:CELL 983 CDM 1, CCU 1, CE 5, HEHTimestamp: 10/10/11 00:01:18



[snip code and data]

I think the following should work.

Chris

#!/usr/bin/perl
use strict;
use warnings;

my $dt; # date_time
while (<DATA>) {
    chomp;
    if (m!^(\d\d/\d\d/\d\d \d\d:\d\d:\d\d)!) {
        $dt = $1;
    }
    elsif (/HEH$/) {
        print "$_ Timestamp $dt\n";
    }
}

__DATA__
10/10/11 00:01:17 #984611

A 01 REPT:CELL 833 CP FAILURE, UNANSWERED TERMINATION
CDMA TRAFFIC CHANNEL CONFIRMATION FAILURE
TRAFFIC CHANNEL FAILURE REASON - ACQUIRE MOBILE FAILURE [2]
DCS 1 TG 1723 TM 374 SG 0 ANT 2
CARRIER 4, CHAN UNAVAIL FS-ECP ID 1, SYS ID 4681
DN 3168710330, MIN 3164094259, IMSI UNAVAIL
SN ###2ddff3 MEID Xa0###629cc SCM ba
ALW CDMA, ASGN CDMA
CDM 1, CCU 2, CE 64, MLG 1/MLG_CDM 1
  DCS 1/PSU 0/SM 3/BHS 6, ECP ID 1, SYS ID 4681



10/10/11 00:01:18 #984614

**01 REPT:CELL 983 CDM 1, CCU 1, CE 5, HEH
SUPPRESSED MSGS: 0
FT PL SECTOR 3 CARRIER 1 (1.9 GHz PCS)
 FAILURE: OUT OF RANGE
PILOT LEVEL: MEASURED = 28.3 dBmEXPECTED = 33.8 dBm
SECONDARY UNIT: CDM 1, CBR 3


Re: regex help

2011-10-10 Thread John W. Krahn


Chris Stinemetz wrote:

Any help is appreciated.


It looks like you don't need regex help.


Once I match HEH how can alter the program to print the contents that
are in the  two lines directly above the match?

For example in this case I would like the print results to be:

**01 REPT:CELL 983 CDM 1, CCU 1, CE 5, HEHTimestamp: 10/10/11 00:01:18

#!/usr/bin/perl

use warnings;
use strict;

while(my $hehline = <DATA>) {
    chomp $hehline;
    if ($hehline =~ /, HEH/) {
        print "$hehline \n";
    }
}


You could simplify that by not removing the newline and then adding it 
back in:


while ( my $hehline = <DATA> ) {
    if ( $hehline =~ /, HEH/ ) {
        print $hehline;
    }
}

And even simpler by not using the $hehline variable:

while ( <DATA> ) {
    print if /, HEH/;
}

But, back to your real problem.

I can think of two ways to do it.

Number one: if you are sure that the line you want is _always_ two lines 
above you could use an array to hold the line you need:


my @buffer;
while ( my $hehline = <DATA> ) {
    push @buffer, $hehline;
    shift @buffer if @buffer > 3;
    if ( $hehline =~ /, HEH/ ) {
        print $buffer[ 0 ];
    }
}


Number two: better to just capture the line you require and only print 
it when the regular expression matches:


my $capture;
while ( my $hehline = <DATA> ) {
    $capture = $hehline if $hehline =~
        m{^\d+/\d+/\d+\s+\d+:\d+:\d+\s+#\d+$};

    if ( $hehline =~ /, HEH/ ) {
        print $capture;
    }
}



John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction.   -- Albert Einstein


Re: regex help

2011-10-10 Thread Shawn H Corey


On 11-10-10 03:56 PM, Chris Stinemetz wrote:

Any help is appreciated.

Once I match HEH how can alter the program to print the contents that
are in the  two lines directly above the match?

For example in this case I would like the print results to be:

**01 REPT:CELL 983 CDM 1, CCU 1, CE 5, HEHTimestamp: 10/10/11 00:01:18


This is not quite what you asked for but it shows how to print lines 
before a match:


#!/usr/bin/env perl

use strict;
use warnings;

# number of lines to save
my $save_nbr = 2;

# save lines before pattern match
#   use undef for lines before beginning
my @lines = ( undef ) x $save_nbr;

while(my $hehline = <DATA>) {
    chomp $hehline;

    if ($hehline =~ /, HEH/) {

        # print before lines
        print "$_\n" for ( grep { defined } @lines );

        # print current line
        print "$hehline\n";
    }

    # remove first saved line
    shift @lines;

    # save current line
    push @lines, $hehline;
}



__DATA__
10/10/11 00:01:17 #984611

A 01 REPT:CELL 833 CP FAILURE, UNANSWERED TERMINATION
 CDMA TRAFFIC CHANNEL CONFIRMATION FAILURE
 TRAFFIC CHANNEL FAILURE REASON - ACQUIRE MOBILE FAILURE [2]
 DCS 1 TG 1723 TM 374 SG 0 ANT 2
 CARRIER 4, CHAN UNAVAIL FS-ECP ID 1, SYS ID 4681
 DN 3168710330, MIN 3164094259, IMSI UNAVAIL
 SN ###2ddff3 MEID Xa0###629cc SCM ba
 ALW CDMA, ASGN CDMA
 CDM 1, CCU 2, CE 64, MLG 1/MLG_CDM 1
   DCS 1/PSU 0/SM 3/BHS 6, ECP ID 1, SYS ID 4681



10/10/11 00:01:18 #984614

**01 REPT:CELL 983 CDM 1, CCU 1, CE 5, HEH
 SUPPRESSED MSGS: 0
 FT PL SECTOR 3 CARRIER 1 (1.9 GHz PCS)
  FAILURE: OUT OF RANGE
 PILOT LEVEL: MEASURED = 28.3 dBmEXPECTED = 33.8 dBm
 SECONDARY UNIT: CDM 1, CBR 3



--
Just my 0.0002 million dollars worth,
  Shawn

Confusion is the first step of understanding.

Programming is as much about organization and communication
as it is about coding.

The secret to great software:  Fail early & often.

Eliminate software piracy:  use only FLOSS.

Make something worthwhile.  -- Dear Hunter


Re: regex help

2011-10-10 Thread John Delacour


At 14:56 -0500 10/10/11, Chris Stinemetz wrote:


Once I match HEH how can alter the program to print the contents that
are in the  two lines directly above the match?


If it's only one instance you need to deal with then this should do the trick:


#!/usr/bin/perl
use strict;
my @lines;
while (<DATA>){
  chomp;
  s/#.*$//;
  push @lines, $_;
  last if /HEH/;
}
print "$lines[-1]  Timestamp: $lines[-3]";
__END__

JD


Re: Regex help

2011-05-17 Thread Rob Dixon


On 16/05/2011 23:44, Owen wrote:


I am trying to get all the 6 letter names in the second field in DATA
below, eg

BARTON
DARWIN
DARWIN

But the script below gives me all 6 letter and more entries.

What I read says {6} means exactly 6.

What is the correct RE?

I have solved the problem my using if (length($data[1]) == 6 ) but
would love to know the correct syntax for the RE


=

#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
 my $line = $_;

 my @line = split /,/;
 $line[1] =~ s /\"//g;

 print "$line[1]\n" if $line[1] =~ /\S{6}/;
}

__DATA__
0200,AUSTRALIAN NATIONAL UNIVERSITY,ACT,PO Boxes
0221,BARTON,ACT,LVR Special Mailing
0800,DARWIN,NT,,DARWIN DELIVERY CENTRE
0801,DARWIN,NT,GPO Boxes,DARWIN GPO DELIVERY ANNEXE
0804,PARAP,NT,PO Boxes,PARAP LPO
0810,ALAWA,NT,,DARWIN DELIVERY CENTRE
0810,BRINKIN,NT,,DARWIN DELIVERY CENTRE
0810,CASUARINA,NT,,DARWIN DELIVERY CENTRE
0810,COCONUT GROVE,NT,,DARWIN DELIVERY CENTRE

===


Hi Owen.

Your test establishes only whether the pattern can be found within the
object string. A test like

'CASUARINA' =~ /\S{6}/;

finds the six non-space characters 'CASUAR' and then returns success as
the criterion has been satisfied.

To get it to match /only/ six-character non-space strings you can add
anchors at the beginning and end of the regex:

'CASUARINA' =~ /^\S{6}$/;

will fail because the sequence "beginning of line, six non-space
characters, end of line" doesn't appear in 'CASUARINA'.

But the proper way to do this is to forget about regular expressions and
treat the data as comma-separated fields. The module Text::CSV will do
this for you, as per the program below.

HTH,

Rob


use strict;
use warnings;

use Text::CSV;

my $csv = Text::CSV->new;

while (my $fields = $csv->getline(*DATA)) {
  my $suburb = $fields->[1];
  next unless $suburb and length $suburb == 6;
  print $suburb, "\n";
}

__DATA__
0200,AUSTRALIAN NATIONAL UNIVERSITY,ACT,PO Boxes
0221,BARTON,ACT,LVR Special Mailing
0800,DARWIN,NT,,DARWIN DELIVERY CENTRE
0801,DARWIN,NT,GPO Boxes,DARWIN GPO DELIVERY ANNEXE
0804,PARAP,NT,PO Boxes,PARAP LPO
0810,ALAWA,NT,,DARWIN DELIVERY CENTRE
0810,BRINKIN,NT,,DARWIN DELIVERY CENTRE
0810,CASUARINA,NT,,DARWIN DELIVERY CENTRE
0810,COCONUT GROVE,NT,,DARWIN DELIVERY CENTRE

**OUTPUT**

BARTON
DARWIN
DARWIN



Re: Regex help

2011-05-16 Thread Jim Gibson

On 5/16/11 Mon  May 16, 2011  3:44 PM, Owen <rc...@pcug.org.au> scribbled:

 I am trying to get all the 6 letter names in the second field in DATA
 below, eg
 
 BARTON
 DARWIN
 DARWIN
 
 But the script below gives me all 6 letter and more entries.
 
 What I read says {6} means exactly 6.

\S{6} will match any string containing 6 consecutive non-whitespace
characters. It will also match any string containing more than 6 such
characters, because any such string contains within it a substring of
exactly six characters. Perl matches do not have to match the entire string.

 
 What is the correct RE?

If you want exactly six characters, then you need to specify that any
characters before or after the wanted six are not also members of the
desired class. In your case, the easiest way is to anchor the match at the
beginning and the end:

$line[1] =~ /^\S{6}$/

If you were looking for word characters, e.g. \w, you could use the word
boundary assertion metasymbol \b:

$line[1] =~ /\b\w{6}\b/

That will not work if your names contain punctuation characters, e.g.
O'Reilly. More complex matches can use the negative lookahead and lookbehind
constructs.
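
For what it is worth, a lookaround version might look like the sketch below.
It is not from the original script; the sub name six_char_tokens and the
sample string are invented for illustration:

```perl
use strict;
use warnings;

# (?<!\S) : not preceded by a non-space character
# (?!\S)  : not followed by a non-space character
# Together they pick out runs of exactly six non-space characters, even
# when the run sits in the middle of a longer string or has punctuation.
sub six_char_tokens {
    return $_[0] =~ /(?<!\S)(\S{6})(?!\S)/g;
}

my @names = six_char_tokens("COCONUT GROVE DARWIN O'Neil BARTON");
print join( ",", @names ), "\n";    # DARWIN,O'Neil,BARTON
```

Unlike the \b\w{6}\b form, this copes with the apostrophe in O'Neil, because
it only cares whether the neighbouring characters are whitespace.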

 
 I have solved the problem my using if (length($data[1]) == 6 ) but
 would love to know the correct syntax for the RE
 
 
 TIA
 
 
 Owen
 
 
 =
 
 #!/usr/bin/perl
 
 use strict;
 use warnings;
 
 while (<DATA>) {
 my $line = $_;
 
 my @line = split /,/;
 $line[1] =~ s /\"//g;
 
 print "$line[1]\n" if $line[1] =~ /\S{6}/;
 }
 
 __DATA__
 0200,AUSTRALIAN NATIONAL UNIVERSITY,ACT,PO Boxes
 0221,BARTON,ACT,LVR Special Mailing
 0800,DARWIN,NT,,DARWIN DELIVERY CENTRE
 0801,DARWIN,NT,GPO Boxes,DARWIN GPO DELIVERY ANNEXE
 0804,PARAP,NT,PO Boxes,PARAP LPO
 0810,ALAWA,NT,,DARWIN DELIVERY CENTRE
 0810,BRINKIN,NT,,DARWIN DELIVERY CENTRE
 0810,CASUARINA,NT,,DARWIN DELIVERY CENTRE
 0810,COCONUT GROVE,NT,,DARWIN DELIVERY CENTRE
 
 ===




Re: RegEx help please ...

2008-08-11 Thread Mr. Shawn H. Corey

On Mon, 2008-08-11 at 14:02 -0700, Saya wrote:
 Hi,
 
 I have the following issue:
 
 my $s = "/metadata-files/test-desc.txt,/metadata-files/birthday.txt,/
 web-media/images/bday-after-help.jpg,javascript:popUp('/pop-ups/
 birthday/main.html','bday-pics',785,460);";
 
 Now I want $s to be like: /metadata-files/test-desc.txt,/metadata-
 files/birthday.txt,/web-media/images/bday-after-help.jpg,/pop-ups/
 birthday/main.html
 
 I have been working with:
 
 $s =~ s/javascript:popup\('(.*)',(.*)/$1/gi;

$s =~ s/javascript:popup\('([^']*)',(.*)/$1/gi;


-- 
Just my 0.0002 million dollars worth,
  Shawn

Where there's duct tape, there's hope.

Perl is the duct tape of the Internet.
Hassan Schroeder, Sun's first webmaster



Re: RegEx help please ...

2008-08-11 Thread Rob Dixon

Saya wrote:
 
 I have the following issue:
 
 my $s = "/metadata-files/test-desc.txt,/metadata-files/birthday.txt,/
 web-media/images/bday-after-help.jpg,javascript:popUp('/pop-ups/
 birthday/main.html','bday-pics',785,460);";
 
 Now I want $s to be like: /metadata-files/test-desc.txt,/metadata-
 files/birthday.txt,/web-media/images/bday-after-help.jpg,/pop-ups/
 birthday/main.html
 
 I have been working with:
 
 $s =~ s/javascript:popup\('(.*)',(.*)/$1/gi;
 
 But this gives me $s looking like this: /metadata-files/levemir-
 desc.txt,/metadata-files/levemir-keywords.txt,/web-media/images/
 img_insulin_interactive.jpg,/pop-ups/why-insulin/
 main.html','quickguide'
 
 How can I only I achieve what I am trying to ?
 Any help or hints will be greatly appreciated.

  $s =~ s/javascript:popUp\('(.*?)'.*/$1/;

does what you want. But without seeing all of the possible data you have I can't
be sure that it will work in every case.

The main mistake you made was to use a greedy capture /(.*)/ which will match up
to the last single-quote in the string, instead of a non-greedy one /(.*?)/
which will match only up to the next single-quote. You also have an unnecessary
capture around the trailing /.*/ which is wasteful but will not cause the
substitution to fail.
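
Here is a small side-by-side sketch of the two behaviours, plus the
negated-class variant suggested elsewhere in this thread. The string is a
shortened version of the one posted, and the three sub names are made up for
the example:

```perl
use strict;
use warnings;

my $s = "/web-media/images/bday-after-help.jpg,"
      . "javascript:popUp('/pop-ups/birthday/main.html','bday-pics',785,460);";

# Greedy: (.*) runs to the end and backs off only as far as the LAST "',".
sub strip_greedy  { my ($t) = @_; $t =~ s/javascript:popUp\('(.*)',.*/$1/;    return $t }

# Non-greedy: (.*?) stops at the FIRST single quote that lets the match succeed.
sub strip_lazy    { my ($t) = @_; $t =~ s/javascript:popUp\('(.*?)'.*/$1/;    return $t }

# Negated class: [^']* cannot cross a single quote at all.
sub strip_negated { my ($t) = @_; $t =~ s/javascript:popUp\('([^']*)'.*/$1/; return $t }

print strip_greedy($s),  "\n";   # ...jpg,/pop-ups/birthday/main.html','bday-pics
print strip_lazy($s),    "\n";   # ...jpg,/pop-ups/birthday/main.html
print strip_negated($s), "\n";   # ...jpg,/pop-ups/birthday/main.html
```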

HTH,

Rob


Re: RegEx help please ...

2008-08-11 Thread Rob Dixon

Mr. Shawn H. Corey wrote:
 On Mon, 2008-08-11 at 14:02 -0700, Saya wrote:

 I have the following issue:

 my $s = /metadata-files/test-desc.txt,/metadata-files/birthday.txt,/
 web-media/images/bday-after-help.jpg,javascript:popUp('/pop-ups/
 birthday/main.html','bday-pics',785,460);

 Now I want $s to be like: /metadata-files/test-desc.txt,/metadata-
 files/birthday.txt,/web-media/images/bday-after-help.jpg,/pop-ups/
 birthday/main.html

 I have been working with:

 $s =~ s/javascript:popup\('(.*)',(.*)/$1/gi;
 
 $s =~ s/javascript:popup\('([^']*)',(.*)/$1/gi;

There's no reason to capture $2 and not use it.

The global substitution is also unlikely to be correct.

Rob


Re: RegEx help please ...

2008-08-11 Thread Mr. Shawn H. Corey

On Mon, 2008-08-11 at 23:30 +0100, Rob Dixon wrote:
 Mr. Shawn H. Corey wrote:
  On Mon, 2008-08-11 at 14:02 -0700, Saya wrote:
 
  I have the following issue:
 
  my $s = /metadata-files/test-desc.txt,/metadata-files/birthday.txt,/
  web-media/images/bday-after-help.jpg,javascript:popUp('/pop-ups/
  birthday/main.html','bday-pics',785,460);
 
  Now I want $s to be like: /metadata-files/test-desc.txt,/metadata-
  files/birthday.txt,/web-media/images/bday-after-help.jpg,/pop-ups/
  birthday/main.html
 
  I have been working with:
 
  $s =~ s/javascript:popup\('(.*)',(.*)/$1/gi;
  
  $s =~ s/javascript:popup\('([^']*)',(.*)/$1/gi;
 
 There's no reason to capture $2 and not use it.
 
 The global substitution is also unlikely to be correct.
 
 Rob
 

You are making too many assumptions.  The OP only posted one line of
code.  That does not mean that $2 is not used in the next, in which case
it should be captured.  And the OP only posted one example.  The real
data may have more than one match.

Isn't one of the guidelines for this list to prune code that has no
bearing on the problem?

Don't assume that only what is posted is the whole story.


-- 
Just my 0.0002 million dollars worth,
  Shawn

Where there's duct tape, there's hope.

Perl is the duct tape of the Internet.
Hassan Schroeder, Sun's first webmaster



Re: RegEx help please ...

2008-08-11 Thread Rob Dixon

Mr. Shawn H. Corey wrote:
 On Mon, 2008-08-11 at 23:30 +0100, Rob Dixon wrote:
 Mr. Shawn H. Corey wrote:
 On Mon, 2008-08-11 at 14:02 -0700, Saya wrote:

 I have the following issue:

 my $s = /metadata-files/test-desc.txt,/metadata-files/birthday.txt,/
 web-media/images/bday-after-help.jpg,javascript:popUp('/pop-ups/
 birthday/main.html','bday-pics',785,460);

 Now I want $s to be like: /metadata-files/test-desc.txt,/metadata-
 files/birthday.txt,/web-media/images/bday-after-help.jpg,/pop-ups/
 birthday/main.html

 I have been working with:

 $s =~ s/javascript:popup\('(.*)',(.*)/$1/gi;

 $s =~ s/javascript:popup\('([^']*)',(.*)/$1/gi;

 There's no reason to capture $2 and not use it.

 The global substitution is also unlikely to be correct.
 
 You are making too many assumptions.  The OP only posted one line of
 code.  That does not mean that $2 is not used in the next, in which case
 it should be captured.  And the OP only posted one example.  The real
 data may have more than one match.
 
 Isn't one of the guidelines for this list is to prune code that has no
 bearing on the problem?
 
 Don't assume that only what is posted is the whole story.

What is posted cannot possibly be the whole story, and I qualified my answer in
my response. I consider it extremely unlikely that the string

  q~'bday-pics',785,460);~

in $2 is wanted later in the program because I cannot conceive of a likely use
of the /last/ such capture in conjunction with a global substitution. If you are
contending that we cannot assume anything at all about the unseen part of the
program then that precludes almost any useful response altogether.

I believe we should make our best guess about the likely context of the question
and declare any tentative assumptions. If you think the second capture in the
regex and the /g flag on the substitution were probably necessary then I
disagree completely. If you agree with me that they were probably unnecessary
but stuck them in there anyway without comment then I also disagree with you.

Rob


Re: regex help

2008-07-24 Thread Mr. Shawn H. Corey

On Thu, 2008-07-24 at 09:44 -0400, Tony Heal wrote:
 I have a text dump of a postgresql database and I want to find out if there
 are any characters that are not standard keyboard characters. Is there a way
 to use regex to do this without doing a character by character scan of a 5GB
 file. 

No.

 I want to know where any character that is not one of these is in the
 file: a-z A-Z 0-9 [EMAIL PROTECTED]*()[]{};:',./?|\


See `perldoc POSIX` and search for 'isalpha'
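
For readers on a current Perl: as far as I know the is* character-class
functions (isalpha and friends) were removed from POSIX in Perl 5.24, so a
negated character class is the more portable route. The sketch below reads
line by line and lets the regex engine do the scanning. It ASSUMES that
"standard keyboard characters" means printable ASCII plus tab, which may not
be exactly the list the poster had in mind; find_unexpected and the sample
text are invented for the example:

```perl
use strict;
use warnings;

# Report every character outside the assumed "keyboard" set
# (printable ASCII, space to tilde, plus tab).
sub find_unexpected {
    my ($fh) = @_;
    my @hits;
    while ( my $line = <$fh> ) {
        chomp $line;
        while ( $line =~ /([^\x20-\x7E\t])/g ) {
            # pos() is the offset just past the match, i.e. the 1-based column.
            push @hits, sprintf "line %d, column %d: 0x%02X",
                $., pos($line), ord $1;
        }
    }
    return @hits;
}

# Hypothetical two-line sample; a real dump would be opened by file name.
my $sample = "plain text\ncaf\xE9 au lait\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n" for find_unexpected($fh);    # line 2, column 4: 0xE9
```

Reading one line at a time keeps memory use flat, which matters for a file of
several gigabytes.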


-- 
Just my 0.0002 million dollars worth,
  Shawn

Where there's duct tape, there's hope.

Perl is the duct tape of the Internet.
Hassan Schroeder, Sun's first webmaster



Re: Regex help

2008-06-21 Thread Dr.Ruud

Ravi Malghan schreef:

 I want to split this string into an array using comma seperator, but
 the problem is some values have one or more commas within them.

That is a common problem. First split on comma, then recombine elements
by using out-of-band knowledge.

-- 
Affijn, Ruud

Gewoon is een tijger.



Re: Regex help

2008-06-21 Thread Rob Dixon

Rob Dixon wrote:
 Ravi Malghan wrote:
 Hi: I am trying to extract some stuff from a string and not getting the 
 expected results. I have looked through 
 http://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to figure 
 this one out.

 I have a string which is a sequence of words and each item is comma seperated
 field1, lengthof value1, value1,field2, length of value2, 
 value2,field3,length of value3, value3 and so on

 After each field name I have the length of the value

 I want to split this string into an array using comma seperator, but the 
 problem is some values have one or more commas within them.

 so for example my string might look like this

 $origString = EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, 
 with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
 USA,ESCALATION-LVL,1,0

 My current code goes character by character and constructs what I want. But
 is very slow when this string gets large.
 
 The program below will do what you describe.

Here's an improvement that explains when it doesn't find values that it expects.

Rob


use strict;
use warnings;

my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, "
  . "with some commas, and more commas,ADDR,35,15421 Test Lane, Rockville, MD, "
  . "USA,ESCALATION-LVL,1,0";

while() {

  $origString =~ /\G([^,]+),(\d+),/g or die "No field name / size found";
  my ($field, $size) = ($1, $2);

  $origString =~ /\G(.{$size})/g or die "Insufficient characters for field size";
  my $value = $1;

  printf "%s (%d) - %s\n", $field, $size, $value;

  $origString =~ /\G,/g or last;
}


Re: Regex help

2008-06-20 Thread yitzle

On Fri, Jun 20, 2008 at 3:10 PM, Ravi Malghan [EMAIL PROTECTED] wrote:
 Hi: I am trying to extract some stuff from a string and not getting the 
 expected results. I have looked through 
 http://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to figure 
 this one out.
 I have a string which is a sequence of words and each item is comma seperated
 field1, lengthof value1, value1,field2, length of value2, 
 value2,field3,length of value3, value3 and so on
 After each field name I have the length of the value
 I want to split this string into an array using comma seperator, but the 
 problem is some values have one or more commas within them.
 so for example my string might look like this
 $origString = EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, 
 with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
 USA,ESCALATION-LVL,1,0
 My current code goes character by character and constructs what I want. But 
 is very slow when this string gets large.
 TIA
 Ravi

My solution:

use strict;
use warnings;

my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a "
    . "test note, with some commas, and more commas,ADDR,3515421 Test Lane, "
    . "Rockville, MD, USA,ESCALATION-LVL,1,0";

my @arr = split (/,/, $origString);
# print join ("\n", @arr); exit;

while ( scalar @arr ) {

    my $field = shift @arr;
    last unless ( defined $field );

    my $vlength = shift @arr;
    last unless ( defined $vlength );
    unless ( $vlength =~ /^\d+$/ ) {
        die "Invalid length: [$vlength]\n";
    }

    my $value = "";
    while ( length ( $value ) < $vlength ) {
        my $bit = shift @arr;
        last unless ( defined $bit );

        $value .= "," if ( length $value );
        $value .= $bit;
    }
    print "$field - $value\n";
}

Time it?


Re: Regex help

2008-06-20 Thread John W. Krahn


Ravi Malghan wrote:

Hi: I am trying to extract some stuff from a string and not getting the 
expected results. I have looked through 
http://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to figure 
this one out.
I have a string which is a sequence of words and each item is comma seperated
field1, lengthof value1, value1,field2, length of value2, value2,field3,length 
of value3, value3 and so on
After each field name I have the length of the value
I want to split this string into an array using comma seperator, but the 
problem is some values have one or more commas within them.
so for example my string might look like this
$origString = EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, with some 
commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
USA,ESCALATION-LVL,1,0
My current code goes character by character and constructs what I want. But is 
very slow when this string gets large.


$ perl -le'
my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test "
    . "note, with some commas, and more commas,ADDR,3515421 Test Lane, "
    . "Rockville, MD, USA,ESCALATION-LVL,1,0";

while ( $origString =~ /([^,]+),(\d+),/g ) {
    print for $1, $2, substr $origString, pos( $origString ), $2;
}
'
EMPLID
4
9066
USERID
7
W3LWEB1
TEXT
54
This is a test note, with some commas, and more commas
ESCALATION-LVL
1
0



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


Re: Regex help

2008-06-20 Thread Rob Dixon

Ravi Malghan wrote:
 
 Hi: I am trying to extract some stuff from a string and not getting the 
 expected results. I have looked through 
 http://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to figure 
 this one out.
 
 I have a string which is a sequence of words and each item is comma seperated
 field1, lengthof value1, value1,field2, length of value2, 
 value2,field3,length of value3, value3 and so on
 
 After each field name I have the length of the value
 
 I want to split this string into an array using comma seperator, but the 
 problem is some values have one or more commas within them.
 
 so for example my string might look like this
 
 $origString = EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, 
 with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
 USA,ESCALATION-LVL,1,0
 
 My current code goes character by character and constructs what I want. But
 is very slow when this string gets large.

The program below will do what you describe.

HTH,

Rob


use strict;
use warnings;

my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, "
  . "with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, "
  . "USA,ESCALATION-LVL,1,0";

while() {

  $origString =~ /\G([^,]+),/g or last;
  my $field = $1;

  $origString =~ /\G(\d+),/g or last;
  my $size = $1;

  $origString =~ /\G(.{$size}),?/g or last;
  my $value = $1;

  printf "%s(%d) - %s\n", $field, $size, $value;
}

**OUTPUT**

EMPLID(4) - 9066
USERID(7) - W3LWEB1
TEXT(54) - This is a test note, with some commas, and more commas




Re: Regex help

2008-06-20 Thread Gunnar Hjalmarsson


Ravi Malghan wrote:
I have a string which is a sequence of words and each item is comma 
seperated
field1, lengthof value1, value1,field2, length of value2, 
value2,field3,length of value3, value3 and so on

After each field name I have the length of the value
I want to split this string into an array using comma seperator, but 
the problem is some values have one or more commas within them.


Okay. There is a missing comma between ADDR,35 and 15421, right? 
Under that assumption, I believe this code gets what you want:


C:\home>type test.pl
my $origString = "EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,"
. "This is a test note, with some commas, and more commas,"
. "ADDR,35,15421 Test Lane, Rockville, MD, USA,ESCALATION-LVL,1,0";

my @parts = split /([A-Z-]+),(\d+)/, $origString;
shift @parts;
while ( my $k = shift @parts ) {
    my $length = shift @parts;
    print "$k = ", substr( shift @parts, 1, $length ), "\n";
}

C:\home>test.pl
EMPLID = 9066
USERID = W3LWEB1
TEXT = This is a test note, with some commas, and more commas
ADDR = 15421 Test Lane, Rockville, MD, USA
ESCALATION-LVL = 0

C:\home>

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


Re: Regex help

2008-06-20 Thread icarus

On Jun 20, 9:10 am, [EMAIL PROTECTED] (Ravi Malghan) wrote:
 Hi: I am trying to extract some stuff from a string and not getting the 
 expected results. I have looked 
 throughhttp://www.perl.com/doc/manual/html/pod/perlre.html and can't seem to 
 figure this one out.
 I have a string which is a sequence of words and each item is comma seperated
 field1, lengthof value1, value1,field2, length of value2, 
 value2,field3,length of value3, value3 and so on
 After each field name I have the length of the value
 I want to split this string into an array using comma seperator, but the 
 problem is some values have one or more commas within them.
 so for example my string might look like this
 $origString = EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, 
 with some commas, and more commas,ADDR,3515421 Test Lane, Rockville, MD, 
 USA,ESCALATION-LVL,1,0
 My current code goes character by character and constructs what I want. But 
 is very slow when this string gets large.
 TIA
 Ravi



> I want to split this string into an array using the comma separator, but the
> problem is some values have one or more commas within them [..]
[snip]
> My current code goes character by character and constructs what I want, but
> it is very slow when this string gets large.

Post your code or the relevant portion.  Otherwise we might
repeat here what you've already done or tried.

Is there any way you can use another delimiter, such as
tildes (~)? If you can tweak the input to use a different delimiter, it
would be easier to handle.  If you cannot, you could use a regex to find
the next alphanumeric character of the string and put those into an
array,

 \w  Match a word character (alphanumeric plus _)
 \W  Match a non-word character
 \b  Match a word boundary
 \B  Match a non-(word boundary)

or find out exactly how many commas it may have and weed them
out...

 *      Match 0 or more times
 +      Match 1 or more times
 ?      Match 1 or 0 times
 {n}    Match exactly n times
 {n,}   Match at least n times
 {n,m}  Match at least n but not more than m times
 etc.

But again, post your code so we don't overlap...
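
Since every value is preceded by its length, one possible approach is to let the length decide how many characters to take, so commas inside a value never need special treatment. The following is only a sketch: `parse_fields` is a hypothetical helper name, and the sample string assumes the ADDR field carries a length of 35 like the other fields.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [name, value] pairs.
sub parse_fields {
    my ($s) = @_;
    my @fields;
    pos($s) = 0;
    # \G continues where the previous match stopped; /c keeps pos on failure.
    while ( $s =~ /\G([^,]+),(\d+),/gc ) {
        my ( $name, $len ) = ( $1, $2 );
        push @fields, [ $name, substr( $s, pos($s), $len ) ];
        pos($s) += $len;      # jump over the value, commas included
        $s =~ /\G,/gc;        # skip the comma before the next field name
    }
    return @fields;
}

my $orig = 'EMPLID,4,9066,USERID,7,W3LWEB1,TEXT,54,This is a test note, '
         . 'with some commas, and more commas,ADDR,35,15421 Test Lane, '
         . 'Rockville, MD, USA,ESCALATION-LVL,1,0';

my @parsed = parse_fields($orig);
print "$_->[0] => $_->[1]\n" for @parsed;
```

Each field costs one regex match and one substr, so this should scale much better than a character-by-character loop.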



Re: Regex help?

2008-05-07 Thread John W. Krahn


sanket vaidya wrote:
> HI all,

Hello,

> Kindly go through the code below.
>
> use warnings;
> use strict;
> my $i=1;
> while($i<=10)
> {
> $_ = "abcpqr";
> $_=~ s/(?=pqr)/$i/;
> print "$_\n";
> $i++;
> }
>
> Output:
> abc1pqr
> abc2pqr
> abc3pqr
> abc4pqr
> abc5pqr
> abc6pqr
> abc7pqr
> abc8pqr
> abc9pqr
> abc10pqr
>
> The expected output is
>
> abc001pqr
> abc002pqr
> abc003pqr
> abc004pqr
> abc005pqr
> abc006pqr
> abc007pqr
> abc008pqr
> abc009pqr
> abc010pqr
>
> Can anyone suggest how to get that output using a regex? I.e., can this
> be done by changing the regex I used in the code?


$ perl -e'
use warnings;
use strict;
for my $i ( 1 .. 10 ) {
    $_ = "abcpqr";
    $_ =~ s/(?=pqr)/sprintf q[%03d], $i/e;
    print "$_\n";
}
'
abc001pqr
abc002pqr
abc003pqr
abc004pqr
abc005pqr
abc006pqr
abc007pqr
abc008pqr
abc009pqr
abc010pqr


Or maybe this would be better:

$ perl -e'
use warnings;
use strict;
$_ = "abcpqr";
$_ =~ s/(?=pqr)/%03d/;
for my $i ( 1 .. 10 ) {
    printf "$_\n", $i;
}
'
abc001pqr
abc002pqr
abc003pqr
abc004pqr
abc005pqr
abc006pqr
abc007pqr
abc008pqr
abc009pqr
abc010pqr
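
The first version works because the /e modifier makes Perl evaluate the replacement as code, so sprintf runs for each substitution. A minimal sketch of the same idea wrapped in a function (the name `numbered` is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: insert a zero-padded counter in front of "pqr".
sub numbered {
    my ( $template, $n ) = @_;
    # Work on a copy so the caller's string is left untouched;
    # /e evaluates the replacement as Perl code.
    ( my $copy = $template ) =~ s/(?=pqr)/sprintf '%03d', $n/e;
    return $copy;
}

my @labels = map { numbered( 'abcpqr', $_ ) } 1 .. 10;
print "$_\n" for @labels;
```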



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


Re: Regex help?

2008-05-07 Thread Dr.Ruud

sanket vaidya schreef:

> use warnings;
> use strict;
> my $i=1;
> while($i<=10)
> {
> $_ = "abcpqr";
> $_=~ s/(?=pqr)/$i/;
> print "$_\n";
> $i++;
> }
> [...]
>
> The expected output is
> abc001pqr
> [...]
> Can anyone suggest how to get that output using a regex? I.e., can
> this be done by changing the regex I used in the code?

Why use a regular expression, or even a substitution? 

perl -Mstrict -Mwarnings -e'
printf q{abc%03dpqr%s}, $_, $/ for 1..3;
'
abc001pqr
abc002pqr
abc003pqr

-- 
Affijn, Ruud

Gewoon is een tijger.


Re: Regex help

2007-11-25 Thread Todd

See the code below. The trick here is that $& is a special variable
that holds the entire matched string.
So in this case, the /[$&]/ regexp literal is equal to
/[.type=xmlrpc&]/.

#!/bin/perl
$url = 'http://abc.com/test.cgi?TEST=1&.type=xmlrpc&type=2';
($r1) = $url =~ /\.type=(.+?)(&|$)/;
print "\$&=$&\n";
($r2) = $url =~ /\.type=(.+?)[$&]/;

print "\$r1=$r1\n\$r2=$r2\n";

__DATA__
$&=.type=xmlrpc&
$r1=xmlrpc
$r2=x


-Todd



Re: Regex help

2007-11-25 Thread John W . Krahn

On Saturday 24 November 2007 22:59, Todd wrote:
> See the codes below. The trick here is that $& is an special var used
> to capture the total match string.
> So in this case, the /[$&]/ regexp literal is equal to /
> [.type=xmlrpc&]/.

Right.  And that is a character class which says to match *one* 
character, either '.' or '=' or '&' or 'c' or 'e' or 'l' or 'm' or 'p' 
or 'r' or 't' or 'x' or 'y'.


> #!/bin/perl
> $url = 'http://abc.com/test.cgi?TEST=1&.type=xmlrpc&type=2';
> ($r1) = $url =~ /\.type=(.+?)(&|$)/;
> print "\$&=$&\n";
> ($r2) = $url =~ /\.type=(.+?)[$&]/;
>
> print "\$r1=$r1\n\$r2=$r2\n"
>
> __DATA__
> $&=.type=xmlrpc&
> $r1=xmlrpc
> $r2=x

Which is why $r2=x because (.+?) matches the 'x' after '.type=' and 
[.type=xmlrpc&] matches the 'm' after '.type=x'.

Also, using $& (or $` or $') slows down all regular expressions in the 
program.
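
To see the interpolation without touching $& at all, here is a small sketch that puts the same text into an ordinary lexical. The variable name `$prev` is only for illustration; it stands in for what $& held after the first match.

```perl
use strict;
use warnings;

my $url  = 'http://abc.com/test.cgi?TEST=1&.type=xmlrpc&type=2';
my $prev = '.type=xmlrpc&';    # stand-in for the value of $&

# [$prev] is interpolated first and becomes the class [.type=xmlrpc&],
# so the lazy (.+?) can stop after a single character.
my ($r2) = $url =~ /\.type=(.+?)[$prev]/;

print "r2=$r2\n";
```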



John
-- 
use Perl;
program
fulfillment


Re: Regex help

2007-11-24 Thread Rob Dixon


howa wrote:
> Hello,
>
> I want to match a query string,
>
> e.g.
>
> http://abc.com/test.cgi?TEST=1&.type=xmlrpc&type=2
>
> I want to extract the .type, currently i use
>
> $ENV{QUERY_STRING} =~ /\.type=(.+?)(&|$)/; # This is okay
>
> but back reference seem no good, i rewrite as
>
> $ENV{QUERY_STRING} =~ /\.type=(.+?)[$&]/;  # seem better, but not
> working
>
> any idea?


Unless you are committed to using a regex, a nicer idea may be to use
the excellent URI modules, as in the program below.

HTH,

Rob


use strict;
use warnings;

use URI;
use URI::QueryParam;

my $uri = URI->new('http://abc.com/test.cgi?TEST=1&.type=xmlrpc&type=2');

print $uri->query_param('.type');

**OUTPUT**

xmlrpc



Re: Regex help

2007-11-23 Thread John W . Krahn

On Thursday 22 November 2007 23:23, howa wrote:
> Hello,

Hello,

> I want to match a query string,
>
> e.g.
>
> http://abc.com/test.cgi?TEST=1&.type=xmlrpc&type=2
>
> I want to extract the .type, currently i use
>
> $ENV{QUERY_STRING} =~ /\.type=(.+?)(&|$)/; # This is okay
>
> but back reference seem no good,

Why does it seem no good?  Perhaps you should use non-capturing 
parentheses instead.

Or maybe this would work better:

$ENV{ QUERY_STRING } =~ /\.type=([^&]+)/;


> i rewrite as
>
> $ENV{QUERY_STRING} =~ /\.type=(.+?)[$&]/;  # seem better, but not
> working

[$&] is a character class that matches either the '$' character or the 
'&' character.  In the previous example '$' in a pattern but not in a 
character class is a meta-character that matches at end-of-line, not 
the literal '$' character.
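
Both suggestions can be tried side by side. This is just a sketch using the query string from the example URL in place of $ENV{QUERY_STRING}; the variable names are made up.

```perl
use strict;
use warnings;

my $query = 'TEST=1&.type=xmlrpc&type=2';

# Non-capturing group: the value ends at '&' or at the end of the string.
my ($lazy) = $query =~ /\.type=(.+?)(?:&|$)/;

# Negated character class: take everything up to the next '&'.
my ($negated) = $query =~ /\.type=([^&]+)/;

print "lazy=$lazy negated=$negated\n";
```

With (?:...) there is only one capture group, so nothing unwanted ends up in $2.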



John
-- 
use Perl;
program
fulfillment


Re: Regex Help

2007-11-21 Thread Matthew Whipple

Omega -1911 wrote:
 which isn't an equivalent to yours - it simply makes sure that the
 record contains 'Powerball:' and at least one digit - but I'm sure it is
 adequate. My own solution didn't even do this much checking, since I
 read the OP as saying that all irrelevant data records had been removed.
 


 I appreciate the help as I am understanding the examples, but when I
 ran Dr. Ruud's example, I had weird data in the common array (Notice
 the number 440):

   
I'd guess the first example didn't include the conditional checking the
data format and there were some 440 HTTP errors while retrieving the
data (obviously hinging on whether that applies to the data source,
especially since it was initially specified as a web page rather than a
web site). 
 common : 120, 10, 07, 440, 6, 120, 7, 07, 440, 22, 120, 3, 07, 440, 1,
 120, 31, 07, 440, 6, 120, 27, 07, 440, 13, 120, 24, 07, 440, 10, 120,
 20, 07, 440, 10, 120, 17, 07, 440, 14, 120, 13, 07, 440, 21, 120, 10,
 07, 440, 12, 120, 6, 07, 440, 8, 120, 3, 07, 440, 2, 120, 29, 07, 440,
 31, 120, 26, 07, 440, 25, 120, 22, 07, 440, 4, 120, 19, 07, 440, 20,
 120, 15, 07, 440, 13, 120, 12, 07, 440, 5, 120, 8, 07, 440, 7, 120, 5,
 07, 440, 11, 120, 1, 07, 440, 12, 120, 29, 07, 440, 13, 120, 25, 07,
 440, 2, 120, 22, 07, 440, 12, 120, 18, 07, 440, 12, 120, 15, 07, 440,
 19, 120, 11, 07, 440, 1, 120, 8, 07, 440, 9, 120, 4, 07, 440, 2, 120,
 1, 07, 440, 9, 120, 28, 07, 440, 15, 120, 25, 07, 440, 28, 120, 21,
 07, 440, 14, 120, 18, 07, 440, 3, 120, 14, 07, 440, 1, 120, 11, 07,
 440, 8, 120, 7, 07, 440, 15, 120, 4, 07, 440, 1, 120, 30, 07, 440, 24,
 120, 27, 07, 440, 9, 120, 23, 07, 440, 14, 120, 20, 07, 440, 23, 120,
 16, 07, 440, 4, 120, 13, 07, 440, 10, 120, 9, 07, 440, 7, 120, 6, 07,
 440, 5, 120, 2, 07, 440, 2, 120, 30, 07, 440, 7, 120, 26, 07, 440, 1,
 120, 23, 07, 440, 3, 120, 19, 07, 440, 3, 120, 16, 07, 440, 6, 120,
 12, 07, 440, 30, 120, 9, 07, 440, 2, 120, 5, 07, 440, 13, 120, 2, 07,
 440, 1, 120, 28, 07, 440, 16, 120, 25, 07, 440, 12, 120, 21, 07, 440,
 22, 120, 18, 07, 440, 6, 120, 14, 07, 440, 12, 120, 11, 07, 440, 6,
 120, 7, 07, 440, 2, 120, 4, 07, 440, 19, 120, 31, 07, 440, 2, 120, 28,
 07, 440, 6, 120, 24, 07, 440, 10, 120, 21, 07, 440, 16, 120, 17, 07,
 440, 7, 120, 14, 07, 440, 4, 120, 10, 07, 440, 14, 120, 7, 07, 440,
 13, 120, 3, 07, 440, 1, 120, 28, 07, 440, 13, 120, 24, 07, 440, 36,
 120, 21, 07, 440, 2, 120, 17, 07, 440, 1, 120, 14, 07, 440, 3, 120,
 10, 07, 440, 2, 120, 7, 07, 440, 4, 120, 3, 07, 440, 12, 120, 31, 07,
 440, 2, 120, 27, 07, 440, 10, 120, 24, 07, 440, 9, 120, 20, 07, 440,
 1, 120, 17, 07, 440, 16, 120, 13, 07, 440, 1, 120, 10, 07, 440, 36,
 120, 6, 07, 440, 1, 120, 3, 07, 440, 10, 120, 30, 06, 440, 9, 120, 27,
 06, 440, 14, 120, 23, 06, 440, 8, 120, 20, 06, 440, 1, 120, 16, 06,
 440, 5, 120, 13, 06, 440, 19, 120, 9, 06, 440, 19, 120, 6, 06, 440, 7,
 120, 2, 06, 440, 17, 120, 29, 06, 440, 2, 120, 25, 06, 440, 5, 120,
 22, 06, 440, 22, 120, 18, 06, 440, 1, 120, 15, 06, 440, 11, 120, 11,
 06, 440, 35 powerball : 22, 29, 31, 16, 25, 11, 11, 15, 30, 16, 30, 4,
 33, 27, 9, 25, 16, 24, 12, 20, 19, 16, 8, 37, 15, 22, 10, 16, 23, 16,
 19, 35, 30, 9, 21, 20, 21, 2, 38, 11, 15, 31, 8, 13, 10, 9, 22, 23, 5,
 11, 19, 7, 44, 13, 21, 8, 22, 13, 26, 10, 21, 15, 28, 30, 5, 20, 38,
 24, 17, 27, 18, 16, 5, 20, 38, 8, 15, 26, 11, 22, 13, 15, 19, 19, 5,
 35, 21, 42, 24, 12, 23, 16, 27, 6, 17, 32, 22, 34, 34, 8, 18, 32, 8,
 28, 38

 BUT, when I run his other example (see below), everything worked as
 well as the other examples you all supplied:

 while (<DATA>) {
 if (my @numbers =
   /(\d+), (\d+), (\d+), (\d+), (\d+), Powerball: (\d+)/) {
   push @common, @numbers[0..4];
   push @powerball, $numbers[5];
 }
 else {
   ...
 }
   }

   



Re: Regex Help

2007-11-19 Thread Jenda Krynicky

From: Dr.Ruud [EMAIL PROTECTED]
 Jenda Krynicky schreef:
  {
my $static;
sub foo {
  $static++;
  ...
}
  }
 
 There (the first declared version of) the variable $static is part of
 the environment of foo(). Don't mistake that for staticness.

Maybe I don't know what "staticness" means then. I thought a 
static variable is one that is private to a function, but keeps its 
value between the function's invocations. How do you define 
staticness?

 In Perl 5.8.8 you can enforce $static to be static, like this:
 
 {
   0 and my $static;
   sub foo {
 $static++;
 ...
   }
 }
 
 That ugly my() can only occur once, but it still makes the variable
 lexical.
 There is just no better way to set up a real static variable in Perl
 5.8.8.
 
 
 Check out the differences between the following two academic examples:
 
 $ perl -le'
    for (7..9)
    {
      my $static = $_;  # declared and initialised 3 times

      sub foo {
        $static++;  # uses the first of the declared $static's
        print "  foo:$static";
      }
      foo() for 0..1;
      print "for:$static";
    }
 '

With -w you get a 'Variable "$static" will not stay shared' warning. 
And rightly so. You are doing something you are not supposed to do.

A named subroutine inside another subroutine or a loop is a red flag. 
Something that (unless found in an obfuscation) suggests that the 
author of the code misunderstood something. It's yet another "please 
don't do this".

$ perl -le'
   for (7..9)
   {
     my $static = $_;  # declared and initialised 3 times

     my $foo = sub {
       $static++;  # closes over the $static of the current iteration
       print "  foo:$static";
     };
     $foo->() for 0..1;
     print "for:$static";
   }
'

Jenda
= [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed 
to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery



Re: Regex Help

2007-11-13 Thread Dr.Ruud

Jenda Krynicky schreef:

 if
 
   0 and my $x;
 
 creates a static $x I call it a bug.

It's called a feature. 

-- 
Affijn, Ruud

Gewoon is een tijger.


Re: Regex Help

2007-11-13 Thread Dr.Ruud

Jenda Krynicky schreef:

 I'd definitely never ever do

   condition and my $x = blah();

That is what I said. It is technically OK to use it with a condition
that can not be decided at compile time, but I still recommend not to
use it.

-- 
Affijn, Ruud

Gewoon is een tijger.



Re: Regex Help

2007-11-13 Thread Dr.Ruud

Jenda Krynicky schreef:

 {
   my $static;
   sub foo {
 $static++;
 ...
   }
 }

There (the first declared version of) the variable $static is part of
the environment of foo(). Don't mistake that for staticness.


In Perl 5.8.8 you can enforce $static to be static, like this:

{
  0 and my $static;
  sub foo {
    $static++;
    ...
  }
}

That ugly my() can only occur once, but it still makes the variable
lexical.
There is just no better way to set up a real static variable in Perl
5.8.8.


Check out the differences between the following two academic examples:

$ perl -le'
   for (7..9)
   {
     my $static = $_;  # declared and initialised 3 times

     sub foo {
       $static++;  # uses the first of the declared $static's
       print "  foo:$static";
     }
     foo() for 0..1;
     print "for:$static";
   }
'
  foo:8
  foo:9
for:9
  foo:10
  foo:11
for:8 (would be undef without the initialisation)
  foo:12
  foo:13
for:9 (would be undef without the initialisation)


$ perl -le'
   for (7..9)
   {
     0 and my $static = $_;  # declared *once*,
                             # *never* initialised
     sub foo {
       $static++;
       print "  foo:$static";
     }
     foo() for 0..1;
     print "for:$static";
   }
'
  foo:1
  foo:2
for:2
  foo:3
  foo:4
for:4
  foo:5
  foo:6
for:6

-- 
Affijn, Ruud

Gewoon is een tijger.



Re: Regex Help

2007-11-12 Thread Jenda Krynicky

From: Dr.Ruud [EMAIL PROTECTED]
> Rob Dixon schreef:
>> Dr.Ruud:
>>> John W . Krahn:
>
>>>> /Powerball:/ and my @numbers = /\d+/g;
>>>
>>> I wouldn't use such a conditional my.
>>
>> There is no conditional 'my': it is a de[c]laration.
>
> I call it a "conditional my". A my can be just a declaration, or a
> declaration and an initialisation. In this case only the initialisation
> is conditional.
>
> A my in a condition has special behaviour if the condition is constant
> false: "0 and my $var;" creates a static $var.
>
> As I wrote: *I* wouldn't use *such* a conditional my. I put the
> declaration on its own line, just before the conditional initialisation.
>
> I sometimes use a conditional my if I want the static behaviour, but not
> in production code. Perl 5.10 has state variables.

Perl 5.x has

{
  my $static;
  sub foo {
$static++;
...
  }
}

which even lets you create variables that are shared by several 
subroutines.
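
A small sketch of that sharing, with two made-up subroutine names (`next_id` and `reset_id`) closing over the same private variable:

```perl
use strict;
use warnings;

{
    my $count = 0;    # visible only inside this block

    # Both named subs refer to the same $count.
    sub next_id  { return ++$count }
    sub reset_id { $count = 0; return }
}

my @ids = ( next_id(), next_id(), next_id() );
reset_id();
push @ids, next_id();

print "@ids\n";
```

Nothing outside the block can read or change $count except through those two subs.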

I do understand you might want to use my() like this:

  open my $FH, '<', $filename or die $^E;

or

  if (my $foo = foo($x, $y, $z) and my $bar = bar(1,2,3)) {
    # ... and use $foo and $bar here
  }

but I'd definitely never ever do

  condition and my $x = blah();

and if

  0 and my $x;

creates a static $x I call it a bug. 

Jenda
= [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed 
to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery



Re: Regex Help

2007-11-11 Thread John W . Krahn

On Saturday 10 November 2007 06:39, Dr.Ruud wrote:
> Jonathan Lang schreef:
>>   while (<DATA>) {
>>     ($a[0], $a[1], $a[2], $a[3], $a[4], $b) = /(\d+), (\d+), (\d+),
>> (\d+), (\d+), Powerball: (\d+)/;
>>     push @common, @a; push @powerball, $b;
>>   }
>
> A slightly different way to do that, is:
>
>   while (<DATA>) {
>     if (my @numbers =
>       /(\d+), (\d+), (\d+), (\d+), (\d+), Powerball: (\d+)/) {

Another way to do that:

   /Powerball:/ and my @numbers = /\d+/g;


>       push @common, @numbers[0..4];
>       push @powerball, $numbers[5];
>     }
>     else {
>       ...
>     }
>   }



John
-- 
use Perl;
program
fulfillment


Re: Regex Help

2007-11-11 Thread Dr.Ruud

John W . Krahn schreef:
> Dr.Ruud:
>> Jonathan Lang:
>
>>>   while (<DATA>) {
>>>     ($a[0], $a[1], $a[2], $a[3], $a[4], $b) = /(\d+), (\d+), (\d+),
>>> (\d+), (\d+), Powerball: (\d+)/;
>>>     push @common, @a; push @powerball, $b;
>>>   }
>>
>> A slightly different way to do that, is:
>>
>>   while (<DATA>) {
>>     if (my @numbers =
>>       /(\d+), (\d+), (\d+), (\d+), (\d+), Powerball: (\d+)/) {
>
> Another way to do that:
>
>    /Powerball:/ and my @numbers = /\d+/g;
>
>>       push @common, @numbers[0..4];
>>       push @powerball, $numbers[5];
>>     }
>>     else {
>>       ...
>>     }
>>   }

I wouldn't use such a conditional my.

So maybe you meant it more like:

if ( /Powerball:/ ) {
    if ( (my @numbers = /\d+/g) > 5 ) {
        push @common, @numbers[0..4];
        push @powerball, $numbers[5];
    }
    else {
        ...
    }
}
else {
    ...
}


For example:

#!/usr/bin/perl
use strict;
use warnings;

{ local ($", $\) = (", ", "\n");

  my @common;
  my @powerball;

  while (<DATA>) {
      if ( /Powerball:/ ) {
          if ( (my @numbers = /\b\d+\b/g) > 5 ) {
              push @common, @numbers[0..4];
              push @powerball, $numbers[5];
          }
          else {
              print <<EOS;

*
* ERROR * parsing input line $.
*
EOS
          }
      }
      else {
          # do nothing
      }
  }

  print "common: @common";
  print "powerball : @powerball";
}

__DATA__
abc 01 def 02 ghi 03 ijk 04 lmn 05 Powerbalx: 06 xyz
abc 11 def 12 ghi 13 ijk 14 lmn 15 Powerball: 16 xyz
abc 21 def 22 ghi 23 ijk 24 lmn 25 Powerball: X6 xyz
abc 31 def 32 ghi 33 ijk 34 lmn 35 Powerball: 36 xyz
test
abc 41 def 42 ghi 43 ijk 44 lmn 45 Powerball: 46.3 xyz

-- 
Affijn, Ruud

Gewoon is een tijger.


Re: Regex Help

2007-11-11 Thread Rob Dixon


Dr.Ruud wrote:
> John W . Krahn schreef:
>> Dr.Ruud:
>>> Jonathan Lang:
>>>>
>>>>   while (<DATA>) {
>>>>     ($a[0], $a[1], $a[2], $a[3], $a[4], $b) = /(\d+), (\d+), (\d+),
>>>> (\d+), (\d+), Powerball: (\d+)/;
>>>>     push @common, @a; push @powerball, $b;
>>>>   }
>>>
>>> A slightly different way to do that, is:
>>>
>>>    while (<DATA>) {
>>>      if (my @numbers =
>>>        /(\d+), (\d+), (\d+), (\d+), (\d+), Powerball: (\d+)/) {
>>
>> Another way to do that:
>>
>>    /Powerball:/ and my @numbers = /\d+/g;
>>
>>>        push @common, @numbers[0..4];
>>>        push @powerball, $numbers[5];
>>>      }
>>>      else {
>>>        ...
>>>      }
>>>    }
>
> I wouldn't use such a conditional my.
>
> So maybe you meant it more like:
>
> if ( /Powerball:/ ) {
>     if ( (my @numbers = /\d+/g) > 5 ) {
>         push @common, @numbers[0..4];
>         push @powerball, $numbers[5];
>     }
>     else {
>         ...
>     }
> }
> else {
>     ...
> }


There is no conditional 'my': it is a declaration. I believe John was 
suggesting a replacement just for your conditional expression:


  if (/Powerball:/ and my @numbers = /\d+/g) {
push @common, @numbers[0..4];
push @powerball, $numbers[5];
  }
  else {
:
  }

which isn't an equivalent to yours - it simply makes sure that the 
record contains 'Powerball:' and at least one digit - but I'm sure it is 
adequate. My own solution didn't even do this much checking, since I 
read the OP as saying that all irrelevant data records had been removed.
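
For anyone who wants to try this form, here is a self-contained sketch. It combines the 'Powerball:' check with the count check discussed earlier in the thread and declares @numbers on its own line; the sample records are the OP's, plus one made-up line that should be skipped.

```perl
use strict;
use warnings;

my @lines = (
    '22, 29, 35, 46, 52, Powerball: 2, Power Play: 5',
    '1, 31, 38, 40, 53, Powerball: 42, Power Play: 2',
    'no numbers of interest here',    # made-up record that must be ignored
);

my ( @common, @powerball );
for my $line (@lines) {
    my @numbers;    # plain declaration; only the assignment is conditional
    # A list assignment in numeric context yields the number of matches.
    if ( $line =~ /Powerball:/ and ( @numbers = $line =~ /\d+/g ) > 5 ) {
        push @common,    @numbers[ 0 .. 4 ];
        push @powerball, $numbers[5];
    }
}

print "common: @common\npowerball: @powerball\n";
```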


Rob


Re: Regex Help

2007-11-11 Thread Omega -1911

 which isn't an equivalent to yours - it simply makes sure that the
 record contains 'Powerball:' and at least one digit - but I'm sure it is
 adequate. My own solution didn't even do this much checking, since I
 read the OP as saying that all irrelevant data records had been removed.


I appreciate the help as I am understanding the examples, but when I
ran Dr. Ruud's example, I had weird data in the common array (notice
the number 440):

common : 120, 10, 07, 440, 6, 120, 7, 07, 440, 22, 120, 3, 07, 440, 1,
120, 31, 07, 440, 6, 120, 27, 07, 440, 13, 120, 24, 07, 440, 10, 120,
20, 07, 440, 10, 120, 17, 07, 440, 14, 120, 13, 07, 440, 21, 120, 10,
07, 440, 12, 120, 6, 07, 440, 8, 120, 3, 07, 440, 2, 120, 29, 07, 440,
31, 120, 26, 07, 440, 25, 120, 22, 07, 440, 4, 120, 19, 07, 440, 20,
120, 15, 07, 440, 13, 120, 12, 07, 440, 5, 120, 8, 07, 440, 7, 120, 5,
07, 440, 11, 120, 1, 07, 440, 12, 120, 29, 07, 440, 13, 120, 25, 07,
440, 2, 120, 22, 07, 440, 12, 120, 18, 07, 440, 12, 120, 15, 07, 440,
19, 120, 11, 07, 440, 1, 120, 8, 07, 440, 9, 120, 4, 07, 440, 2, 120,
1, 07, 440, 9, 120, 28, 07, 440, 15, 120, 25, 07, 440, 28, 120, 21,
07, 440, 14, 120, 18, 07, 440, 3, 120, 14, 07, 440, 1, 120, 11, 07,
440, 8, 120, 7, 07, 440, 15, 120, 4, 07, 440, 1, 120, 30, 07, 440, 24,
120, 27, 07, 440, 9, 120, 23, 07, 440, 14, 120, 20, 07, 440, 23, 120,
16, 07, 440, 4, 120, 13, 07, 440, 10, 120, 9, 07, 440, 7, 120, 6, 07,
440, 5, 120, 2, 07, 440, 2, 120, 30, 07, 440, 7, 120, 26, 07, 440, 1,
120, 23, 07, 440, 3, 120, 19, 07, 440, 3, 120, 16, 07, 440, 6, 120,
12, 07, 440, 30, 120, 9, 07, 440, 2, 120, 5, 07, 440, 13, 120, 2, 07,
440, 1, 120, 28, 07, 440, 16, 120, 25, 07, 440, 12, 120, 21, 07, 440,
22, 120, 18, 07, 440, 6, 120, 14, 07, 440, 12, 120, 11, 07, 440, 6,
120, 7, 07, 440, 2, 120, 4, 07, 440, 19, 120, 31, 07, 440, 2, 120, 28,
07, 440, 6, 120, 24, 07, 440, 10, 120, 21, 07, 440, 16, 120, 17, 07,
440, 7, 120, 14, 07, 440, 4, 120, 10, 07, 440, 14, 120, 7, 07, 440,
13, 120, 3, 07, 440, 1, 120, 28, 07, 440, 13, 120, 24, 07, 440, 36,
120, 21, 07, 440, 2, 120, 17, 07, 440, 1, 120, 14, 07, 440, 3, 120,
10, 07, 440, 2, 120, 7, 07, 440, 4, 120, 3, 07, 440, 12, 120, 31, 07,
440, 2, 120, 27, 07, 440, 10, 120, 24, 07, 440, 9, 120, 20, 07, 440,
1, 120, 17, 07, 440, 16, 120, 13, 07, 440, 1, 120, 10, 07, 440, 36,
120, 6, 07, 440, 1, 120, 3, 07, 440, 10, 120, 30, 06, 440, 9, 120, 27,
06, 440, 14, 120, 23, 06, 440, 8, 120, 20, 06, 440, 1, 120, 16, 06,
440, 5, 120, 13, 06, 440, 19, 120, 9, 06, 440, 19, 120, 6, 06, 440, 7,
120, 2, 06, 440, 17, 120, 29, 06, 440, 2, 120, 25, 06, 440, 5, 120,
22, 06, 440, 22, 120, 18, 06, 440, 1, 120, 15, 06, 440, 11, 120, 11,
06, 440, 35 powerball : 22, 29, 31, 16, 25, 11, 11, 15, 30, 16, 30, 4,
33, 27, 9, 25, 16, 24, 12, 20, 19, 16, 8, 37, 15, 22, 10, 16, 23, 16,
19, 35, 30, 9, 21, 20, 21, 2, 38, 11, 15, 31, 8, 13, 10, 9, 22, 23, 5,
11, 19, 7, 44, 13, 21, 8, 22, 13, 26, 10, 21, 15, 28, 30, 5, 20, 38,
24, 17, 27, 18, 16, 5, 20, 38, 8, 15, 26, 11, 22, 13, 15, 19, 19, 5,
35, 21, 42, 24, 12, 23, 16, 27, 6, 17, 32, 22, 34, 34, 8, 18, 32, 8,
28, 38

BUT, when I run his other example (see below), everything worked as
well as the other examples you all supplied:

while (<DATA>) {
if (my @numbers =
  /(\d+), (\d+), (\d+), (\d+), (\d+), Powerball: (\d+)/) {
  push @common, @numbers[0..4];
  push @powerball, $numbers[5];
}
else {
  ...
}
  }


Re: Regex Help

2007-11-11 Thread Dr.Ruud

Rob Dixon schreef:
> Dr.Ruud:
>> John W . Krahn:
>
>>> /Powerball:/ and my @numbers = /\d+/g;
>>
>> I wouldn't use such a conditional my.
>
> There is no conditional 'my': it is a de[c]laration.

I call it a "conditional my". A my can be just a declaration, or a
declaration and an initialisation. In this case only the initialisation
is conditional.

A my in a condition has special behaviour if the condition is constant
false: "0 and my $var;" creates a static $var.

As I wrote: *I* wouldn't use *such* a conditional my. I put the
declaration on its own line, just before the conditional initialisation.

I sometimes use a conditional my if I want the static behaviour, but not
in production code. Perl 5.10 has state variables.
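
For reference, the Perl 5.10 feature for this is the `state` keyword, which has to be enabled explicitly. A minimal sketch follows; the sub name `counter` is made up. As far as I know, later Perl releases reject the "0 and my $var" construct altogether, so `state` is the safer spelling where it is available.

```perl
use strict;
use warnings;
use feature 'state';    # available from Perl 5.10 on

sub counter {
    state $n = 0;       # initialised once, value kept between calls
    return ++$n;
}

my @seen = map { counter() } 1 .. 3;
print "@seen\n";
```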

-- 
Affijn, Ruud

Gewoon is een tijger.



Re: Regex Help

2007-11-10 Thread joy_peng

On Nov 10, 2007 5:10 PM, Omega -1911 [EMAIL PROTECTED] wrote:
What I will need to be able to do
 is place the most common 5 numbers  (before the word powerball) into
 an array then place the powerball numbers into another array. Thanks
 in advance.

 @liners = split /(\s\[0-9],\s)Powerball:\s[0-9]/,$data_string;

 _DATA_
 22, 29, 35, 46, 52, Powerball: 2, Power Play: 5
 1, 31, 38, 40, 53, Powerball: 42, Power Play: 2
 6, 16, 18, 29, 37, Powerball: 24, Power Play: 2



Hi,

I just think the data structure you need is a hash, not two arrays. The
entire code can be:

use strict;
use warnings;
use Data::Dumper;

my %hash;

while (<DATA>) {
    # skip lines that do not have the expected layout
    my ($li, $powerb) = /^(.+),\s*Powerball:\s*(\d+)/ or next;
    $hash{$powerb} = [ split /,\s*/, $li ];
}

print Dumper \%hash;


__DATA__
22, 29, 35, 46, 52, Powerball: 2, Power Play: 5
1, 31, 38, 40, 53, Powerball: 42, Power Play: 2
6, 16, 18, 29, 37, Powerball: 24, Power Play: 2


Re: Regex Help

2007-11-10 Thread Jonathan Lang

Omega -1911 wrote:
 @liners = split /(\s\[0-9],\s)Powerball:\s[0-9]/,$data_string;

Instead of split, just do a pattern match:

  ($a[0], $a[1], $a[2], $a[3], $a[4], $b) = /(\d+), (\d+), (\d+),
(\d+), (\d+), Powerball: (\d+)/;

This puts the first five numbers into the array @a, and puts the
powerball number into scalar $b.

Note that this tackles a single line of data.  To get everything,
cycle through the lines using a while (<DATA>) and push the results
onto the two arrays as you get them:

  push @common, @a; push @powerball, $b;

In whole, you get:

  while (<DATA>) {
    ($a[0], $a[1], $a[2], $a[3], $a[4], $b) =
      /(\d+), (\d+), (\d+), (\d+), (\d+), Powerball: (\d+)/;
    push @common, @a; push @powerball, $b;
  }

When you're done, @common is (22, 29, 35, 46, 52, 1, 31, 38, 40, 53,
6, 16, 18, 29, 37), and @powerball is (2, 42, 24).
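
Here is a runnable sketch of the same loop. It reads from an array instead of DATA so that it is self-contained, and it adds one thing not in the code above: lines that don't fit the pattern are skipped rather than pushed as undefined values.

```perl
use strict;
use warnings;

my @records = (
    '22, 29, 35, 46, 52, Powerball: 2, Power Play: 5',
    '1, 31, 38, 40, 53, Powerball: 42, Power Play: 2',
    '6, 16, 18, 29, 37, Powerball: 24, Power Play: 2',
);

my ( @common, @powerball );
for (@records) {
    # In list context the match returns the six captures,
    # or an empty list when the line does not fit the pattern.
    my @n = /(\d+), (\d+), (\d+), (\d+), (\d+), Powerball: (\d+)/
        or next;
    push @common,    @n[ 0 .. 4 ];
    push @powerball, $n[5];
}

print "common: @common\npowerball: @powerball\n";
```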

-- 
Jonathan Dataweaver Lang


Re: Regex Help

2007-11-10 Thread Dr.Ruud

Jonathan Lang schreef:

   while (<DATA>) {
 ($a[0], $a[1], $a[2], $a[3], $a[4], $b) = /(\d+), (\d+), (\d+),
 (\d+), (\d+), Powerball: (\d+)/;
 push @common, @a; push @powerball, $b;
   }

A slightly different way to do that, is:

   while (<DATA>) {
 if (my @numbers =
   /(\d+), (\d+), (\d+), (\d+), (\d+), Powerball: (\d+)/) {
   push @common, @numbers[0..4];
   push @powerball, $numbers[5];
 }
 else {
   ...
 }
   }

-- 
Affijn, Ruud

Gewoon is een tijger.


Re: Regex Help

2007-11-10 Thread Omega -1911

Thank you both, Dr.Ruud & Jonathan Lang. I will give both examples a
try later today and let you know how it all turns out.


Re: regex help

2007-09-26 Thread [EMAIL PROTECTED]

On Sep 25, 4:33 pm, [EMAIL PROTECTED] (Rob Dixon) wrote:
 Jonathan Lang wrote:
  Rob Dixon wrote:
  Jonathan Lang wrote:
  I'm trying to devise a regex that matches from the first double-quote
  character found to the next double-quote character that isn't part of
  a pair; but for some reason, I'm having no luck.  Here's what I tried:

    /"(.*?)"(?!")/

  Sample text:

    author: "Jonathan ""Dataweaver"" Lang" key=val

  What I'm getting for $1 in the first match:

    Jonathan "

  What I'm looking for:

    Jonathan ""Dataweaver"" Lang

  What did I miss, and how can I most efficiently perform the desired match?
  Your regex looks for the first double-quote and then captures everything 
  after
  that up to the first subsequent double-quote that isn't followed 
  immediately by
  another one. The second quote of the pair before 'Dataweaver' matches this
  criterion so your regex captures up to the character before it.

  This:

    $str =~ /"((?:.*?"")*.*?)"/;

  should do what you want. After finding the first double-quote it captures 
  all
  following sequences ending in a pair of double quotes, plus anything after
  those up to the closing quote.

  Ah.  I had tried /"((.*?"")*.*?)"/ and hadn't gotten it to work; it
  never occurred to me to try the non-capturing group instead.

 That also works! (But is performing unnecessary and wasteful captures.)

 Rob

 use strict;
 use warnings;

 my $str = q(author: "Jonathan ""Dataweaver"" Lang" key=val);

 $str =~ /"((.*?"")*.*?)"/;
 print $1, "\n";

 **OUTPUT**

 Jonathan ""Dataweaver"" Lang

use strict;
use warnings;

my $str = q(author: "Jonathan ""Dataweaver"" Lang" key=val fly-in-ointment: "Brian ""Nobull"" McCauley");

$str =~ /"((.*?"")*.*?)"/;
print $1, "\n";

__END__

**OUTPUT**

Jonathan ""Dataweaver"" Lang" key=val fly-in-ointment: "Brian ""Nobull"" McCauley

An alternative pattern would be /"((?:[^"]*"")*.*?)"/ although the
behaviour of that may be counter-intuitive if presented with bad input
in which there's no closing quote.


My preferred pattern would be much closer to Jonathan's original:

/"((?:[^"]|"")*)"(?!")/

This has the advantage of failing to match if presented with input
that lacks a closing quote.
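
A quick sketch of that last pattern in use. The second input is a made-up, simple case of an unterminated value, and the s/""/"/g step that undoes the doubling is an extra, not part of the pattern itself.

```perl
use strict;
use warnings;

# A quoted value in which "" stands for one embedded quote.
my $quoted = qr/"((?:[^"]|"")*)"(?!")/;

my $good = q(author: "Jonathan ""Dataweaver"" Lang" key=val);
my $bad  = q(author: "Jonathan Lang key=val);    # no closing quote

my ($value) = $good =~ $quoted;
( my $plain = $value ) =~ s/""/"/g;               # undo the doubling
my $bad_matched = $bad =~ $quoted ? 1 : 0;

print "$value\n$plain\nunterminated input matched: $bad_matched\n";
```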


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: regex help

2007-09-25 Thread Rob Dixon


Jonathan Lang wrote:


I'm trying to devise a regex that matches from the first double-quote
character found to the next double-quote character that isn't part of
a pair; but for some reason, I'm having no luck.  Here's what I tried:

  /"(.*?)"(?!")/

Sample text:

  author: "Jonathan ""Dataweaver"" Lang" key=val

What I'm getting for $1 in the first match:

  Jonathan "

What I'm looking for:

  Jonathan ""Dataweaver"" Lang

What did I miss, and how can I most efficiently perform the desired match?


Your regex looks for the first double-quote and then captures everything after
that up to the first subsequent double-quote that isn't followed immediately by
another one. The second quote of the pair before 'Dataweaver' matches this
criterion so your regex captures up to the character before it.

This:

 $str =~ /"((?:.*?"")*.*?)"/;

should do what you want. After finding the first double-quote it captures all
following sequences ending in a pair of double quotes, plus anything after
those up to the closing quote.
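
To compare the two patterns on the sample text, a short sketch (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $str = q(author: "Jonathan ""Dataweaver"" Lang" key=val);

# The original attempt stops at the second quote of the first "" pair.
my ($original) = $str =~ /"(.*?)"(?!")/;

# The revised pattern first consumes every run that ends in "",
# then takes the rest up to the closing quote.
my ($revised) = $str =~ /"((?:.*?"")*.*?)"/;

print "original: $original\nrevised:  $revised\n";
```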

HTH,

Rob


Re: regex help

2007-09-25 Thread Jonathan Lang

Rob Dixon wrote:
 Jonathan Lang wrote:
  I'm trying to devise a regex that matches from the first double-quote
  character found to the next double-quote character that isn't part of
  a pair; but for some reason, I'm having no luck.  Here's what I tried:
 
    /"(.*?)"(?!")/
 
  Sample text:
 
    author: "Jonathan ""Dataweaver"" Lang" key=val
 
  What I'm getting for $1 in the first match:
 
    Jonathan "
 
  What I'm looking for:
 
    Jonathan ""Dataweaver"" Lang
 
  What did I miss, and how can I most efficiently perform the desired match?

 Your regex looks for the first double-quote and then captures everything after
 that up to the first subsequent double-quote that isn't followed immediately 
 by
 another one. The second quote of the pair before 'Dataweaver' matches this
 criterion so your regex captures up to the character before it.

 This:

   $str =~ /"((?:.*?"")*.*?)"/;

 should do what you want. After finding the first double-quote it captures all
 following sequences ending in a pair of double quotes, plus anything after
 those up to the closing quote.

Ah.  I had tried /"((.*?"")*.*?)"/ and hadn't gotten it to work; it
never occurred to me to try the non-capturing group instead.

Thank you.

-- 
Jonathan Dataweaver Lang

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Re: regex help

2007-09-25 Thread Rob Dixon


Jonathan Lang wrote:

Rob Dixon wrote:

Jonathan Lang wrote:

I'm trying to devise a regex that matches from the first double-quote
character found to the next double-quote character that isn't part of
a pair; but for some reason, I'm having no luck.  Here's what I tried:

  /"(.*?)"(?!")/

Sample text:

  author: "Jonathan ""Dataweaver"" Lang" key=val

What I'm getting for $1 in the first match:

  Jonathan "

What I'm looking for:

  Jonathan ""Dataweaver"" Lang

What did I miss, and how can I most efficiently perform the desired match?

Your regex looks for the first double-quote and then captures everything after
that up to the first subsequent double-quote that isn't followed immediately by
another one. The second quote of the pair before 'Dataweaver' matches this
criterion so your regex captures up to the character before it.

This:

  $str =~ /"((?:.*?"")*.*?)"/;

should do what you want. After finding the first double-quote it captures all
following sequences ending in a pair of double quotes, plus anything after
those up to the closing quote.


Ah.  I had tried /"((.*?"")*.*?)"/ and hadn't gotten it to work; it
never occurred to me to try the non-capturing group instead.


That also works! (But is performing unnecessary and wasteful captures.)

Rob



use strict;
use warnings;

my $str = q(author: "Jonathan ""Dataweaver"" Lang" key=val);

$str =~ /"((.*?"")*.*?)"/;
print $1, "\n";

**OUTPUT**

Jonathan ""Dataweaver"" Lang
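
For comparison, here is a minimal sketch of the same extraction written with a negated character class instead of the lazy `.*?` pieces. This is a swapped-in variant, not the pattern posted in the thread, and the helper name `quoted_field` is made up for illustration. It assumes a doubled quote ("") is the only escape used inside the field.

```perl
use strict;
use warnings;

# Return the contents of the first double-quoted field in $text, treating ""
# as an escaped quote. Returns undef when no complete quoted field is found.
# (quoted_field is a hypothetical helper name, not something from the thread.)
sub quoted_field {
    my ($text) = @_;
    return $text =~ /"((?:[^"]|"")*)"/ ? $1 : undef;
}

my $str   = q(author: "Jonathan ""Dataweaver"" Lang" key=val);
my $field = quoted_field($str);
print "$field\n";

# Collapse each escaped pair back into a single quote character.
(my $plain = $field) =~ s/""/"/g;
print "$plain\n";
```

Each step inside the group consumes either one non-quote character or one whole pair of quotes, so the closing quote can only be a quote that is not the first half of a pair.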



Re: Regex help

2007-09-04 Thread Beginner

On 3 Sep 2007 at 17:44, Rob Dixon wrote:

 Beginner wrote:
  
  I am trying to come up with a regex to squash multiple commas into
  one. The line I am working on looks like this:
  
  SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,  
  DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , ,
  
  There are instances of /,\s{1,},/ and /,,/ 
  
  The bit that I am struggling with is finding a way to get a use a
  multiplier for the regex /,\s+/ but I have to be careful not to 
  remove single entries. I guess the order of my substitutions is 
  important here.
  
  Can anyone offer any tips please?
 
 Hey Dermot.
 
 I think just
 
   $text =~ s/,[,\s]+/,/g;
 
Indeed Rob that works too. 

You've used square brackets for what I think they call 'alternation'; 
the next character might be a comma and a whitespace. I have always 
thought of square brackets as being for character classes EG: [a-z]. 
I associate alternation with parenthesis and the pipe /(this|that)/

perlrequick demos examples like:

/[a-z]+\s+\d*/;  # match a lowercase word, at least some space, and
 # any number of digits

but I don't think I've seen examples where there is a character class 
like \s or \w within square brackets before. 
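
A short sketch of what the bracketed class in that substitution does, under the assumption that the goal is to leave exactly one comma between entries. `[,\s]` matches one character that is either a comma or whitespace; escapes such as `\s`, `\w` and `\d` are allowed inside square brackets. The helper name `squash_commas` is invented for this example.

```perl
use strict;
use warnings;

# Replace a comma plus any following run of commas and whitespace with a
# single comma. [,\s] is a character class: one comma OR one whitespace
# character, repeated by the + quantifier.
sub squash_commas {
    my ($text) = @_;
    $text =~ s/,[,\s]+/,/g;
    return $text;
}

print squash_commas('A, ,  B,C, , ,D'), "\n";   # A,B,C,D
```

The class behaves like the alternation `(?:,|\s)` but matches one character at a time without needing a group.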

Anyway back to reading perlretut, perlop and others.
Thanx,
Dp.




Re: Regex help

2007-09-03 Thread John W. Krahn


Beginner wrote:

Hi,


Hello,

I am trying to come up with a regex to squash multiple commas into 
one. The line I am working on looks like this:


SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,  
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , 

There are instances of /,\s{1,},/ and /,,/ 

The bit that I am struggling with is finding a way to get a use a 
multiplier for the regex /,\s+/ but I have to be careful not to 
remove single entries. I guess the order of my substitutions is 
important here.


$ perl -le'
$_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, , 
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ];

 print;
s/,\s*(?=,)//g;
print;
'
SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, , 
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , ,
SPEED OF LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL, 
CONCEPT,CONCEPTS,



$ perl -le'
$_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, , 
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ];

print;
$_ = join ",", grep /\S/, split /,/;
print;
'
SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, , 
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , ,
SPEED OF LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL, 
CONCEPT,CONCEPTS





John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


RE: Regex help

2007-09-03 Thread Andrew Curry

Christ That's certainly 1 way ;) 


RE: Regex help

2007-09-03 Thread Andrew Curry

Think

s/(\,+\s*)+/,/g;

Should work

It produces
SPEED OF LIGHT,LIGHT
SPEED,TRAVEL,TRAVELLING,DANGER,DANGEROUS,PHYSICAL,CONCEPT,CONCEPTS

If that's what you want. 


RE: Regex help

2007-09-03 Thread Beginner

On 3 Sep 2007 at 16:15, Andrew Curry wrote:

 Think
 
 s/(\,+\s*)+/,/g;
 
 Should work
 
 It produces
 SPEED OF LIGHT,LIGHT
 SPEED,TRAVEL,TRAVELLING,DANGER,DANGEROUS,PHYSICAL,CONCEPT,CONCEPTS
 
 If that's what you want. 

Exactly what I want. Thanx,
Dp.





RE: Regex help

2007-09-03 Thread Beginner

On 3 Sep 2007 at 16:12, Andrew Curry wrote:

 $ perl -le'
 $_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
 DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ];  
 print; s/,\s*(?=,)//g; print; '
 SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
 DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , SPEED OF
 LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL,
 CONCEPT,CONCEPTS,
 
 
 $ perl -le'
 $_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
 DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ]; print;
 $_ = join ",", grep /\S/, split /,/; print; '
 SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
 DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , SPEED OF
 LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL,
 CONCEPT,CONCEPTS
 
 
 
 
 John

Okay I need to ask what's going on here.  

I had to use the  

s/,\s*(?=,)//g  

expression because the  

s/(\,+\s*)+/,/g;  

regex in my code snip wasn't working as it did on the text snippet I 
originally supplied.  

=== code snip ===
 while (<FH>) {
    chomp($_);
    s/"//g;
    s/\t/, /g;
    s/,\s*(?=,)//g;
    print "\"$_\"\n";
}
 == 

I can understand the 2nd method: A grouped, literal comma (\,), one 
or more times followed by a zero or more spaces.  

The 2nd regex reads to me like, a comma then zero or more spaces but 
what's that (?=,) doing? Is it referring to the preceding expression 
and saying if it matches up to 1 time? I can't see what the equal 
sign is doing either.

Enlightenment please.
Dp.



Re: Regex help

2007-09-03 Thread John W. Krahn


Beginner wrote:

On 3 Sep 2007 at 16:12, Andrew Curry wrote:


Please do not attribute to Andrew Curry a post that was actually submitted by 
me (see my name at the end there.)  TIA




$ perl -le'
$_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ];  
print; s/,\s*(?=,)//g; print; '
SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , SPEED OF
LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL,
CONCEPT,CONCEPTS,


$ perl -le'
$_ = q[SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , ]; print;
$_ = join ",", grep /\S/, split /,/; print; '
SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , SPEED OF
LIGHT,  LIGHT SPEED,TRAVEL,TRAVELLING,  DANGER,DANGEROUS,PHYSICAL,
CONCEPT,CONCEPTS




John


Okay I need to ask what's going on here.  

I had to use the  

s/,\s*(?=,)//g  

expression because the  

s/(\,+\s*)+/,/g;  

regex in my code snip wasn't working as it did on the text snippet I 
originally supplied.  


"wasn't working" is not a very good description of the problem.



=== code snip ===
 while (<FH>) {
chomp($_);


Why remove the newline and then add it back at the end of the loop?


s/"//g;


It is more efficient to use transliteration to remove characters from a string:

tr/"//d;


s/\t/, /g;
s/,\s*(?=,)//g;
print "\"$_\"\n";


You could use different quoting so you don't have to escape the quotation marks:

 print qq["$_"\n];


}
 == 

I can understand the 2nd method: A grouped, literal comma (\,), one 
or more times followed by a zero or more spaces.  

The 2nd regex reads to me like, a comma then zero or more spaces but 
what's that (?=,) doing?


It is a zero-width positive look-ahead assertion.  It says that a comma *must* 
follow the pattern but is not included as part of the pattern.
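
To make the look-ahead concrete, here is a small sketch comparing it with a version that consumes the second comma. The helper names `drop_empty_fields` and `consuming_version` are made up for this illustration.

```perl
use strict;
use warnings;

# Delete every comma (plus trailing whitespace) that is directly followed by
# another comma. (?=,) only checks that a comma comes next; it does not
# consume it, so that comma can start the next match.
sub drop_empty_fields {
    my ($text) = @_;
    $text =~ s/,\s*(?=,)//g;
    return $text;
}

# Without the look-ahead the second comma is part of the match, so the next
# search resumes after it and can miss one comma in an odd-length run.
sub consuming_version {
    my ($text) = @_;
    $text =~ s/,\s*,/,/g;
    return $text;
}

print drop_empty_fields('A,,,B'), "\n";   # A,B
print consuming_version('A,,,B'), "\n";   # A,,B
```

The `=` is simply part of the `(?=...)` syntax; `(?!...)` is the negative form, which requires that the pattern does not follow.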



Is it referring to the preceding expression 
and saying if it matches up to 1 time? I can't see what the equal 
sign is doing either.


Enlightenment please.


perldoc perlre




John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


Re: Regex help

2007-09-03 Thread Rob Dixon


Beginner wrote:


I am trying to come up with a regex to squash multiple commas into 
one. The line I am working on looks like this:


SPEED OF LIGHT, ,  LIGHT SPEED,TRAVEL,TRAVELLING, ,  
DANGER,DANGEROUS,PHYSICAL, ,  CONCEPT,CONCEPTS, , , , , , , , , , 

There are instances of /,\s{1,},/ and /,,/ 

The bit that I am struggling with is finding a way to get a use a 
multiplier for the regex /,\s+/ but I have to be careful not to 
remove single entries. I guess the order of my substitutions is 
important here.


Can anyone offer any tips please?


Hey Dermot.

I think just

 $text =~ s/,[,\s]+/,/g;

is all you need.

Rob



Re: Regex help

2007-09-03 Thread John W. Krahn


Andrew Curry wrote:

Think

s/(\,+\s*)+/,/g;


Commas are not special in a regular expression so there is no need to escape 
them.  You are using capturing parentheses but are not using the string 
captured in $1, better to use non-capturing parentheses.


s/(?:,+\s*)+/,/g;


A modified pattern inside a modified group is inefficient and could bomb if 
the string is long enough:


$ perl -le'
$_ = ", " x 100_000;
s/(?:,+\s*)+/,/g;
print;
'
Segmentation fault (core dumped)

$ perl -le'
$_ = ", " x 100_000;
s/,\s*(?=,)//g;
print;
'
,
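
A rough side-by-side of the patterns discussed in this thread, run on a much shorter string so that it should be safe on any perl. The sub names are invented for the example, and `squash_class` (a single quantified character class matching the same text as the nested form) is an addition of this sketch rather than something John posted.

```perl
use strict;
use warnings;

# Quantifier inside a quantified group (the form warned about above).
sub squash_nested    { my ($t) = @_; $t =~ s/(?:,+\s*)+/,/g; return $t }
# Same set of matches expressed with one quantified character class.
sub squash_class     { my ($t) = @_; $t =~ s/,[,\s]*/,/g;    return $t }
# Look-ahead version: removes all but the last comma of a run and keeps
# whatever whitespace follows that last comma.
sub squash_lookahead { my ($t) = @_; $t =~ s/,\s*(?=,)//g;   return $t }

my $input = (', ' x 5) . 'X';
print squash_nested($input),    "\n";   # ,X
print squash_class($input),     "\n";   # ,X
print squash_lookahead($input), "\n";   # , X
```

Note that the look-ahead version is not a drop-in replacement: it leaves the whitespace after the surviving comma in place.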




John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


Re: Regex help

2007-09-03 Thread John W. Krahn


[ Please do not top-post.  TIA ]


[EMAIL PROTECTED] wrote:

Hi


Hello,

Unless Perl is the only tool available to you in your toolbox and if 
you're running Linux or similar consider the tr -s  command in a 
shell.


Perl also has that:

tr/,//s;


perldoc perlop


However if you are strictly limited to Perl then this stand regex 
works:-

echo ,|perl -ane 's/,*/,/;print'


The OP's string also included spaces after the commas.  Why are you using the 
-a switch, which splits the current line on whitespace and stores it in the @F 
array, when you are not using the @F array?  Why use the -n switch and 'print' 
instead of just using the -p switch?


$ echo abc | perl -ane 's/,*/,/;print'
,abc

You are using a modifier that matches zero times so you are adding commas 
where none existed before.
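
A small sketch of the difference, using invented helper names: `,*` is allowed to match nothing, `,+` needs at least one comma, and `tr/,//s` squeezes runs of commas without a regex.

```perl
use strict;
use warnings;

# ,* can match the empty string at position 0, so a comma is inserted at
# the start of any string that does not begin with one.
sub star_version { my ($t) = @_; $t =~ s/,*/,/;  return $t }
# ,+ requires at least one comma, so only real runs are replaced.
sub plus_version { my ($t) = @_; $t =~ s/,+/,/g; return $t }
# tr with the /s modifier squeezes each run of commas down to one.
sub tr_version   { my ($t) = @_; $t =~ tr/,//s;  return $t }

print star_version('abc'),      "\n";   # ,abc
print plus_version('a,,,b,,c'), "\n";   # a,b,c
print tr_version('a,,,b,,c'),   "\n";   # a,b,c
```

None of these three handles whitespace between the commas, which the original poster's data also contains.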




John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


RE: Regex help

2007-09-03 Thread asmith9983


Hi

Unless Perl is the only tool available to you in your toolbox and if you're 
running Linux or similar consider the tr -s  command in a shell. However if 
you are 
strictly limited to Perl then this stand regex works:-

echo ,|perl -ane 's/,*/,/;print'
Try it by cutting and pasting it.

No doubt you'll get lots of other answers,so choose the one you like best,and 
can remember.


--
Andrew in Edinburgh,Scotland


Re: regex help

2007-08-22 Thread Mumia W..


On 08/21/2007 07:41 AM, Tony Heal wrote:

the list is a list of files by version. I need to keep the last 5 versions.

Jeff's code works fine except I am getting some empty strings at the beginning 
that I have not figured out.

Here is what I have so far. Lines 34 and 39 are provide a print out for 
troubleshooting. Once I get this fixed all I
need to do is shift the top five from the list and unlink the rest.

#!/usr/bin/perl

use warnings;
use strict;

opendir (REPOSITORY, '/usr/local/repository/dists/');
my @repositories = readdir (REPOSITORY);
closedir (REPOSITORY);

my $packageRepo;
my @values;
my @newValues;
foreach (@repositories)
{
$packageRepo = $_;
chomp ($packageRepo);
opendir (packageREPO, "/usr/local/repository/dists/$packageRepo/non-free/binary-i386");
my @repoFiles = readdir (packageREPO);
close (packageREPO);
foreach (@repoFiles)
{
my $fileName = $_;
chomp ($fileName);
if ( /(.*)(([0-9][0-9])(-special)?\.([0-9])(-)([0-9]*))(.*)/)
{
push (@values, $2);
}
}
my %h;
	foreach (@values) 
	{

push (@newValues, $_) unless $h{$_}++
}
foreach (@newValues){print "$_\n";}
	my @new = map { $_->[0] } 
	sort { $b->[1] <=> $a->[1] } 
	map { [$_,(split/-/)[-1]] } 
	@newValues;

print "@new[0..4]\n";
}


Or for a line numbered version
http://rafb.net/p/asqgJo27.html

Tony Heal





[oops, sent to the wrong list before]

Sort::Maker should make short work for this task ;-)

All you have to do is to make a regex to pull out the version numbers.
After that, you're practically done:

use strict;
use warnings;
require Sort::Maker;

open (pkgREPO, '<', 'data/versions-list.txt')
    or die "no versions list: $!";

my @versions;

while (my $line = <pkgREPO>) {
    chomp $line;
    push @versions, [ $line, $line =~ /^(\d+)(?:-[a-z]+)?\.(\d+)-(\d+)/ ];
}

close pkgREPO;

my $sorter = Sort::Maker::make_sorter(
    'ST',
    number => '$_->[1]',
    number => '$_->[2]',
    number => '$_->[3]',
);
die $@ unless $sorter;

my @sorted = $sorter->(@versions);
print "keep: $_->[0]\n" for @sorted[$#sorted-4 .. $#sorted];
print "delete: $_->[0]\n" for @sorted[0 .. $#sorted-5];

__END__

This is the output:

keep: 16.5-2
keep: 16-special.5-2
keep: 16.5-10
keep: 16.5-13
keep: 16-special.6-6
delete: 14-special.1-2
delete: 14-special.1-8
delete: 14-special.1-15
delete: 14-special.2-40
delete: 14-special.2-41
delete: 14-special.3-4
delete: 14-special.3-7
delete: 14-special.3-12
delete: 15-special.1-52
delete: 15-special.1-53
delete: 15-special.1-54
delete: 15.2-108
delete: 15.2-110
delete: 15.2-111
delete: 15.3-12
delete: 16.1-17
delete: 16.1-22
delete: 16.1-23
delete: 16.1-39
delete: 16.3-1
delete: 16.3-6
delete: 16.3-7
delete: 16.3-8
delete: 16.3-15
delete: 16-special.4-9
delete: 16-special.4-10
delete: 16.5-1
delete: 16-special.5-1







Re: regex help

2007-08-21 Thread Mr. Shawn H. Corey

Jeff Pang wrote:

-Original Message-

From: Mr. Shawn H. Corey [EMAIL PROTECTED]
Sent: Aug 21, 2007 12:32 PM
To: Jeff Pang [EMAIL PROTECTED]
Cc: beginners@perl.org
Subject: Re: regex help

Jeff Pang wrote:

my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { 
[$_,(split/-/)[-1]] } @arr;
print @new[0..4];

Fails; this would put '15-special.3-45' before '15-special.1-51'

Well,have you tested the codes then said this?
I sort it based on the last number field splited by '-'.It works fine for me.

The point is that only the OP can say what is significant.  And s/he hasn't.

--
Just my 0.0002 million dollars worth,
 Shawn

For the things we have to learn before we can do them, we learn by doing them.
 Aristotle


RE: regex help

2007-08-21 Thread Tony Heal

the list is a list of files by version. I need to keep the last 5 versions.

Jeff's code works fine except I am getting some empty strings at the beginning 
that I have not figured out.

Here is what I have so far. Lines 34 and 39 provide a printout for
troubleshooting. Once I get this fixed, all I need to do is shift the top
five from the list and unlink the rest.

#!/usr/bin/perl

use warnings;
use strict;

opendir (REPOSITORY, '/usr/local/repository/dists/');
my @repositories = readdir (REPOSITORY);
closedir (REPOSITORY);

my $packageRepo;
my @values;
my @newValues;
foreach (@repositories)
{
$packageRepo = $_;
chomp ($packageRepo);
opendir (packageREPO, "/usr/local/repository/dists/$packageRepo/non-free/binary-i386");
my @repoFiles = readdir (packageREPO);
close (packageREPO);
foreach (@repoFiles)
{
my $fileName = $_;
chomp ($fileName);
if ( /(.*)(([0-9][0-9])(-special)?\.([0-9])(-)([0-9]*))(.*)/)
{
push (@values, $2);
}
}
my %h;
foreach (@values) 
{
push (@newValues, $_) unless $h{$_}++
}
foreach (@newValues){print "$_\n";}
my @new = map { $_->[0] } 
sort { $b->[1] <=> $a->[1] } 
map { [$_,(split/-/)[-1]] } 
@newValues;
print "@new[0..4]\n";
}


Or for a line numbered version
http://rafb.net/p/asqgJo27.html

Tony Heal




RE: regex help

2007-08-21 Thread Tony Heal

Here is a sample of the versions that I am using.
16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6

Tony Heal
Pace Systems Group, Inc.
800-624-5999
[EMAIL PROTECTED]
 


Re: regex help

2007-08-21 Thread Chas Owens

On 8/21/07, Tony Heal [EMAIL PROTECTED] wrote:
 Here is a sample of the versions that I am using.
snip

Just to clarify, you have a version string with the following format:

{major}{custom tag}.{minor}-{build}

and you want the list sorted by major, then minor, then build.

#!/usr/bin/perl

use strict;
use warnings;

my @versions;
while (<DATA>) {
    chomp;
    die "invalid format" unless
        my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
    push @versions, [ $major, $minor, $build , $_];
}

print "$_->[-1]\n" for sort {
    $a->[0] <=> $b->[0] or
    $a->[1] <=> $b->[1] or
    $a->[2] <=> $b->[2]
} @versions;

__DATA__
16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6
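
Building on that sort order, here is a hedged sketch of the remaining step the original poster described: keeping the newest five versions. It assumes the {major}{custom tag}.{minor}-{build} scheme stated above and that "newest" means highest major, then minor, then build. The sub names and the anchored pattern are choices made for this example, not code from the thread.

```perl
use strict;
use warnings;

# Parse "{major}[-tag].{minor}-{build}" into numeric parts; die on anything
# that does not follow the scheme.
sub parse_version {
    my ($v) = @_;
    my ($major, $minor, $build) = $v =~ /^(\d+)(?:-[^.]+)?\.(\d+)-(\d+)$/
        or die "invalid format: $v\n";
    return [ $major, $minor, $build, $v ];
}

# Return the version strings sorted from oldest to newest.
sub sort_versions {
    return map  { $_->[3] }
           sort { $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1] or $a->[2] <=> $b->[2] }
           map  { parse_version($_) } @_;
}

my @sorted = sort_versions(qw(16.3-15 15.2-108 16-special.6-6 14-special.1-2
                              16.3-7 16.5-13 16.5-2));
# Keep at most the last five entries of the sorted list.
my @keep = @sorted > 5 ? @sorted[-5 .. -1] : @sorted;
print "keep: $_\n" for @keep;
```

Everything in @sorted that is not in @keep would then be the candidate list for deletion.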


RE: regex help

2007-08-21 Thread Jeff Pang



-Original Message-
From: Tony Heal [EMAIL PROTECTED]
Sent: Aug 21, 2007 9:25 PM
To: [EMAIL PROTECTED], beginners@perl.org
Subject: RE: regex help

Here is a sample of the versions that I am using.
16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6


OK, try it this way. It sorts the versions from high to low and outputs the first 5.

use strict;
use warnings;

my @arr = qw(16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6
);

my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] or $b->[2] <=> $a->[2] or 
$b->[3] <=> $a->[3] } map { [ $_, split/\D+/ ] } @arr;
print @new[0..4];

__END__

Good luck!

--
Jeff Pang - [EMAIL PROTECTED]
http://home.arcor.de/jeffpang/


Re: regex help

2007-08-21 Thread Chas Owens

On 8/21/07, Jeff Pang [EMAIL PROTECTED] wrote:
snip
 my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] or $b->[2] <=>
 $a->[2] or $b->[3] <=> $a->[3] } map { [ $_, split/\D+/ ] } @arr;
snip

While splitting on non-number is a nifty solution, it would break if
the custom tag can contain a number (16-custom2.2-14).  It is better
to nail down the version number scheme and write a regex that pulls
the required info from it that throws an error if a version does not
match the scheme.
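
The failure mode can be seen in a few lines. This is only a sketch: the two helper names are hypothetical, and the anchored pattern is one possible way to pin down the scheme, not the one agreed on in the thread.

```perl
use strict;
use warnings;

# Splitting on runs of non-digits treats a digit inside the tag as a field.
sub fields_by_split { my ($v) = @_; return split /\D+/, $v }

# An anchored pattern returns exactly three numbers in list context, or an
# empty list when the string does not follow the expected scheme.
sub fields_by_regex { my ($v) = @_; return $v =~ /^(\d+)(?:-[^.]+)?\.(\d+)-(\d+)$/ }

print join(' ', fields_by_split('16-custom2.2-14')), "\n";   # 16 2 2 14
print join(' ', fields_by_regex('16-custom2.2-14')), "\n";   # 16 2 14
```

With the split version the sort would silently compare the tag's "2" as the minor number; with the regex version a malformed string yields no fields at all, which the caller can detect and report.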


Re: regex help

2007-08-21 Thread Chas Owens

On 8/21/07, Jeff Pang [EMAIL PROTECTED] wrote:

 -Original Message-
 From: Chas Owens [EMAIL PROTECTED]
 Sent: Aug 21, 2007 10:01 PM
 To: Jeff Pang [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED], beginners@perl.org
 Subject: Re: regex help

 On 8/21/07, Jeff Pang [EMAIL PROTECTED] wrote:
 snip
  my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] or $b->[2] <=>
  $a->[2] or $b->[3] <=> $a->[3] } map { [ $_, split/\D+/ ] } @arr;
 snip

 While splitting on non-number is a nifty solution, it would break if
 the custom tag can contain a number (16-custom2.2-14).  It is better
 to nail down the version number scheme and write a regex that pulls
 the required info from it that throws an error if a version does not
 match the scheme.

 Have you seen this case on his datas?

I have seen a sampling of his data; if that is all of the data he has
then he can sort it by hand and doesn't need Perl.  Experience has
taught me to expect the worst from data.  You need to be able to
detect (if not recover from) malformed data and your split /\D/ will
just silently do the wrong thing (well, there might be some undef
warnings if the version were 12.4).  GIGO* is fine for custom
crafted one liners, but production quality code should at least make
an attempt to notice if the data is bad and signal the user/admin.

* Garbage In/Garbage Out
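
A minimal sketch of that idea, assuming the
{major}{custom tag}.{minor}-{build} scheme discussed in this thread.
The sub name parse_version, the anchored pattern, and the sample
strings are my own assumptions, not code from the thread:

```perl
use strict;
use warnings;

# Hypothetical parser: returns (major, minor, build), or an empty list
# when the string does not fit the scheme.  The pattern is anchored so
# that leading or trailing junk is rejected rather than ignored.
sub parse_version {
    my ($version) = @_;
    return $version =~ /\A(\d+)(?:-[^.]+)?\.(\d+)-(\d+)\z/;
}

my (@good, @rejected);
for my $v ('16.1-17', '16-custom2.2-14', '12.4', '') {
    if (my @fields = parse_version($v)) {
        push @good, [ $v, @fields ];
    }
    else {
        push @rejected, $v;
        warn "skipping malformed version '$v'\n";   # signal the user/admin
    }
}

print scalar(@good), " parsed, ", scalar(@rejected), " rejected\n";
```

Whether to warn and continue or to die on the first bad record depends
on the job; the point is only that bad input gets noticed.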


Re: regex help

2007-08-21 Thread Jeff Pang

-Original Message-
From: Chas Owens [EMAIL PROTECTED]
Sent: Aug 21, 2007 10:01 PM
To: Jeff Pang [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], beginners@perl.org
Subject: Re: regex help

On 8/21/07, Jeff Pang [EMAIL PROTECTED] wrote:
snip
 my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] or $b->[2] <=>
 $a->[2] or $b->[3] <=> $a->[3] } map { [ $_, split/\D+/ ] } @arr;
snip

While splitting on non-number is a nifty solution, it would break if
the custom tag can contain a number (16-custom2.2-14).  It is better
to nail down the version number scheme and write a regex that pulls
the required info from it that throws an error if a version does not
match the scheme.

Have you seen this case in his data?

--
Jeff Pang - [EMAIL PROTECTED]
http://home.arcor.de/jeffpang/


RE: regex help

2007-08-21 Thread Tony Heal

OK I added this and I keep getting invalid format

foreach (@newValues){print "$_\n";}
my @versions;
while (@newValues) 
{
chomp;
die "invalid format" unless
my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
push @versions, [ $major, $minor, $build , $_];
}
foreach (@versions){print "$_\n";}
}

/tmp# ./trim.pl
14.20-33
14.20-34
14.18-29
14.18-33
14.18-34
14.18-35
14.18-37
14.20-27
14.20-28
14.20-29
14.20-30
14.20-31
14.20-32
14.16-30
14.16-31
invalid format at ./trim.pl line 41. (41 is the die line)


sorry Chas I first sent to you and not the list.

Tony Heal
Pace Systems Group, Inc.
800-624-5999
[EMAIL PROTECTED]
 

 -Original Message-
 From: Chas Owens [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, August 21, 2007 9:50 AM
 To: [EMAIL PROTECTED]
 Cc: beginners@perl.org
 Subject: Re: regex help
 
 On 8/21/07, Tony Heal [EMAIL PROTECTED] wrote:
  Here is a sample of the versions that I am using.
 snip
 
 Just to clarify, you have a version string with the following format:
 
 {major}{custom tag}.{minor}-{build}
 
 and you want the list sorted by major, then minor, then build.
 
 #!/usr/bin/perl
 
 use strict;
 use warnings;
 
 my @versions;
 while (<DATA>) {
     chomp;
     die "invalid format" unless
         my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
     push @versions, [ $major, $minor, $build , $_];
 }
 
 print "$_->[-1]\n" for sort {
     $a->[0] <=> $b->[0] or
     $a->[1] <=> $b->[1] or
     $a->[2] <=> $b->[2]
 } @versions;
 
 __DATA__
 16.1-17
 16.1-22
 16.1-23
 16.1-39
 16.3-1
 16.3-6
 16.3-7
 16.3-8
 16.3-15
 16.5-1
 16.5-2
 16.5-10
 16.5-13
 15.3-12
 15.2-108
 14-special.1-2
 14-special.1-8
 14-special.1-15
 14-special.2-40
 14-special.2-41
 14-special.3-4
 14-special.3-7
 14-special.3-12
 15.2-110
 15.2-111
 15-special.1-52
 15-special.1-53
 15-special.1-54
 16-special.4-9
 16-special.4-10
 16-special.5-1
 16-special.5-2
 16-special.6-6



Re: regex help

2007-08-21 Thread Chas Owens

On 8/21/07, Tony Heal [EMAIL PROTECTED] wrote:
 OK I added this and I keep getting invalid format

 foreach (@newValues){print "$_\n";}
 my @versions;
 while (@newValues)
 {
 chomp;
 die "invalid format" unless
 my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
 push @versions, [ $major, $minor, $build , $_];
 }
 foreach (@versions){print "$_\n";}
 }
snip

That would be because the code makes no sense.  My example read a
version at a time from the DATA file handle, transformed it, and
pushed it onto an array, then sorted the array and printed it.  Yours
has all of the versions in an array and tries to loop over the array
with a while loop (which doesn't work to start with), and you never
bother to sort the data.  If you aren't reading from a file then you
might as well add the first loop back onto the Schwartzian transform
(map -> sort -> unmap).  Please note that

    die "bad format" unless
        my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;

is one statement and should be indented as above.  If you don't indent,
it looks like the die and the assignment are unrelated.  If you find
the style confusing you may consider using this instead

    my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/
        or die "bad format";
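
On the loop itself: "while (@newValues)" only tests whether the array
is non-empty and never assigns anything to $_, so the match has nothing
to work on.  A sketch of the same loop written with for, which aliases
$_ to each element; the sample data is made up, and the offending
string is included in the error message:

```perl
use strict;
use warnings;

# Sample data is made up for illustration.
my @newValues = ("14.20-33\n", "14-special.1-8\n", "14.16-31\n");

my @versions;
for (@newValues) {    # for aliases $_ to each element in turn
    chomp;
    my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/
        or die "bad format: '$_'";
    push @versions, [ $major, $minor, $build, $_ ];
}

print "$_->[3] => $_->[0] $_->[1] $_->[2]\n" for @versions;
```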



#!/usr/bin/perl

use strict;
use warnings;

#I don't know how you are getting these values
my @newValues = map { chomp; $_ } <DATA>;

print "unsorted\n";
print "$_\n" for @newValues;

@newValues =
    #unmap to recover the original data
    map { $_->[0] }
    #sort
    sort {
        $a->[1] <=> $b->[1] or
        $a->[2] <=> $b->[2] or
        $a->[3] <=> $b->[3]
    }
    #map into a sortable form
    map {
        die "bad format" unless
            my ($major, $minor, $build) =
                /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
        [$_, $major, $minor, $build]
    }
    @newValues;

print "sorted\n";
print "$_\n" for @newValues;

__DATA__
16.1-17
16.1-22
16.1-23
16.1-39
16.3-1
16.3-6
16.3-7
16.3-8
16.3-15
16.5-1
16.5-2
16.5-10
16.5-13
15.3-12
15.2-108
14-special.1-2
14-special.1-8
14-special.1-15
14-special.2-40
14-special.2-41
14-special.3-4
14-special.3-7
14-special.3-12
15.2-110
15.2-111
15-special.1-52
15-special.1-53
15-special.1-54
16-special.4-9
16-special.4-10
16-special.5-1
16-special.5-2
16-special.6-6


Re: regex help

2007-08-21 Thread D. Bolliger

Tony Heal am Dienstag, 21. August 2007:
  -Original Message-
  From: Chas Owens [mailto:[EMAIL PROTECTED]
  Sent: Tuesday, August 21, 2007 9:50 AM
  To: [EMAIL PROTECTED]
  Cc: beginners@perl.org
  Subject: Re: regex help
 
  On 8/21/07, Tony Heal [EMAIL PROTECTED] wrote:
   Here is a sample of the versions that I am using.
 
  snip
 
  Just to clarify, you have a version string with the following format:
 
  {major}{custom tag}.{minor}-{build}
 
  and you want the list sorted by major, then minor, then build.
 
  #!/usr/bin/perl
 
  use strict;
  use warnings;
 
  my @versions;
  while (<DATA>) {
      chomp;
      die "invalid format" unless
          my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
      push @versions, [ $major, $minor, $build , $_];
  }
 
  print "$_->[-1]\n" for sort {
      $a->[0] <=> $b->[0] or
      $a->[1] <=> $b->[1] or
      $a->[2] <=> $b->[2]
  } @versions;
 
  __DATA__
  16.1-17
[snip]
  16-special.4-10
  16-special.5-1
  16-special.5-2
  16-special.6-6

Hello Tony

Just include the original line in the die message to see what caused it (an 
empty line would, for example). 
Based on that, you can then adapt the regex.

 OK I added this and I keep getting invalid format

 foreach (@newValues){print "$_\n";}
   my @versions;
   while (@newValues)
   {
   chomp;
   die "invalid format" unless

die "invalid format of '$_'" unless

   my ($major, $minor, $build) = /(\d+)(?:-.+)?\.(\d+)-(\d+)/;
   push @versions, [ $major, $minor, $build , $_];
   }
   foreach (@versions){print "$_\n";}
 }

 /tmp# ./trim.pl
 14.20-33
[snip]
 14.16-31
 invalid format at ./trim.pl line 41. (41 is the die line)



Re: regex help

2007-08-20 Thread Mr. Shawn H. Corey


Tony Heal wrote:

I have an array that will have these values. Each value is part of a file name. 
I need to keep the highest (numerically)
5 files and delete the rest.  What is the easiest way to sort the array?


Break each file name into fields and sort by most significant field to least.  Use 
the Schwartzian Transform http://en.wikipedia.org/wiki/Schwartzian_Transform 
to sort.
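
A rough sketch of that approach, not a drop-in solution.  The sample
names are made up, and the field layout {major}{tag}.{minor}-{build}
is an assumption about the real file names; the tag is ignored when
comparing:

```perl
use strict;
use warnings;

# Made-up sample names; the real list comes from the file system.
my @names = qw(15.2-100 15-special.3-45 15-special.1-51 14-special.4-33
               15.2-65 15.2-124 15-special.4-4);

my @sorted =
    map  { $_->[0] }                  # 3. recover the original name
    sort {                            # 2. compare field by field, highest first
        $b->[1] <=> $a->[1] or
        $b->[2] <=> $a->[2] or
        $b->[3] <=> $a->[3]
    }
    map  {                            # 1. break each name into numeric fields
        my @f = /\A(\d+)(?:-[^.]+)?\.(\d+)-(\d+)\z/
            or die "unexpected name '$_'";
        [ $_, @f ];
    }
    @names;

# Keep the five highest; everything after them is a deletion candidate.
my @keep   = @sorted[ 0 .. ($#sorted < 4 ? $#sorted : 4) ];
my @delete = @sorted[ 5 .. $#sorted ];

print "keep:   @keep\n";
print "delete: @delete\n";
```

Printing the deletion list first and only later passing it to unlink
makes it easy to check the ordering against the real data.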

See:
perldoc perlretut
perldoc perlre


--
Just my 0.0002 million dollars worth,
 Shawn

For the things we have to learn before we can do them, we learn by doing them.
 Aristotle


Re: regex help

2007-08-20 Thread Jeff Pang



-Original Message-
From: Tony Heal [EMAIL PROTECTED]
Sent: Aug 21, 2007 5:50 AM
To: beginners@perl.org
Subject: regex help

I have an array that will have these values. Each value is part of a file 
name. I need to keep the highest (numerically)
5 files and delete the rest.  What is the easiest way to sort the array?


Well, it can be sorted, but by which field in the filename? The last numeric 
field?

Here is one way:

use strict;
use warnings;

my @arr = qw(14-special.4-32
14-special.4-32
14-special.4-33
14-special.4-33
15-special.1-51
15-special.1-51
15-special.1-52
15-special.1-52
15-special.1-52
15-special.1-53
15-special.1-53
15-special.1-53
15-special.1-54
15-special.1-54
15-special.3-44
15-special.3-44
15-special.3-45
15-special.3-45
15-special.4-4
15-special.4-4
15.2-100
15.2-100
15.2-104
15.2-104
15.2-124
15.2-124
15.2-65
15.2-65
15.2-66
15.2-66);

my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { 
[$_,(split/-/)[-1]] } @arr;
print @new[0..4];


--
Jeff Pang - [EMAIL PROTECTED]
http://home.arcor.de/jeffpang/


Re: regex help

2007-08-20 Thread Mr. Shawn H. Corey


Jeff Pang wrote:

use strict;
use warnings;

my @arr = qw(14-special.4-32
14-special.4-32
14-special.4-33
14-special.4-33
15-special.1-51
15-special.1-51
15-special.1-52
15-special.1-52
15-special.1-52
15-special.1-53
15-special.1-53
15-special.1-53
15-special.1-54
15-special.1-54
15-special.3-44
15-special.3-44
15-special.3-45
15-special.3-45
15-special.4-4
15-special.4-4
15.2-100
15.2-100
15.2-104
15.2-104
15.2-124
15.2-124
15.2-65
15.2-65
15.2-66
15.2-66);

my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { 
[$_,(split/-/)[-1]] } @arr;
print @new[0..4];


Fails; this would put '15-special.3-45' before '15-special.1-51'

As I said, separate the data into fields, based on your knowledge of how to do 
it.  (Nobody on this list knows how.)

Then you can sort.


--
Just my 0.0002 million dollars worth,
 Shawn

For the things we have to learn before we can do them, we learn by doing them.
 Aristotle


Re: regex help

2007-08-20 Thread Jeff Pang



-Original Message-
From: Mr. Shawn H. Corey [EMAIL PROTECTED]
Sent: Aug 21, 2007 12:32 PM
To: Jeff Pang [EMAIL PROTECTED]
Cc: beginners@perl.org
Subject: Re: regex help

Jeff Pang wrote:
 use strict;
 use warnings;
 
 my @arr = qw(14-special.4-32
 14-special.4-32
 14-special.4-33
 14-special.4-33
 15-special.1-51
 15-special.1-51
 15-special.1-52
 15-special.1-52
 15-special.1-52
 15-special.1-53
 15-special.1-53
 15-special.1-53
 15-special.1-54
 15-special.1-54
 15-special.3-44
 15-special.3-44
 15-special.3-45
 15-special.3-45
 15-special.4-4
 15-special.4-4
 15.2-100
 15.2-100
 15.2-104
 15.2-104
 15.2-124
 15.2-124
 15.2-65
 15.2-65
 15.2-66
 15.2-66);
 
 my @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { 
 [$_,(split/-/)[-1]] } @arr;
 print @new[0..4];

Fails; this would put '15-special.3-45' before '15-special.1-51'


Well, did you test the code before saying this?
I sort on the last numeric field, split on '-'. It works fine for me.


--
Jeff Pang - [EMAIL PROTECTED]
http://home.arcor.de/jeffpang/


Re: regex help

2007-08-08 Thread Dr.Ruud

Jeff Pang schreef:
 John W. Krahn:
 Tony Heal:

 Why doesn't this work? I want to take any leading 
 or trailing white spaces out. 
 
 perldoc -q "How do I strip blank space"
 
 Or generally it could be done by,
 $string =~ s/^\s+|\s+$//g;

The g-modifier doesn't mean generally nor good. ;-) 
Please see the suggested perldoc text for the proper ways. 

I like to use:

  s/^\s+//, s/\s+$// for $string;

but

  $string =~ s/^\s+//;
  $string =~ s/\s+$//;

may be slightly faster
(likely because there is no localization of $_).
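
As a sketch, the two-statement form wraps naturally into a small
helper.  The sub name trim and the sample strings are my own, chosen
for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the two-substitution form.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;    # strip leading whitespace
    $s =~ s/\s+$//;    # strip trailing whitespace
    return $s;
}

for my $in ("  hello world \n", "no-padding", " \t ") {
    printf "[%s]\n", trim($in);
}
```

The helper works on a copy, so the caller's string is left untouched.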

-- 
Affijn, Ruud

Gewoon is een tijger.


RE: regex help

2007-08-08 Thread Dan Sopher

This works in a one-liner:

$string =~ s/^\s*(.*\S)\s*$/$1/;

Cheers!

-Dan



-Original Message-
From: Dr.Ruud [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 08, 2007 2:05 PM
To: beginners@perl.org
Subject: Re: regex help

Jeff Pang schreef:
 John W. Krahn:
 Tony Heal:

 Why doesn't this work? I want to take any leading 
 or trailing white spaces out. 
 
 perldoc -q "How do I strip blank space"
 
 Or generally it could be done by,
 $string =~ s/^\s+|\s+$//g;

The g-modifier doesn't mean generally nor good. ;-) 
Please see the suggested perldoc text for the proper ways. 

I like to use:

  s/^\s+//, s/\s+$// for $string;

but

  $string =~ s/^\s+//;
  $string =~ s/\s+$//;

may be slightly faster.
(like because no localization of $_) 

-- 
Affijn, Ruud

Gewoon is een tijger.


Re: regex help

2007-08-08 Thread John W. Krahn


Dan Sopher wrote:

This works in a one-liner:

$string =~ s/^\s*(.*\S)\s*$/$1/;

Cheers!


Let's compare Dan's one-liner to the solutions in the FAQ (perlfaq4):

$ perl -le'
for ( "\nX\n", "\nX", "X\n", "X", "\n\n\n", "\n", "" ) {
    $a = $b = $c = $_;

    $d  = $a =~ s/^\s*(.*\S)\s*$/$1/;

    $e  = $b =~ s/^\s+//;
    $e += $b =~ s/\s+$//;

    $f  = $c =~ s/^\s+|\s+$//g;

    print "Test: ", ++$g, " Length of original: ", length( $_ ), "\n",
          "Dan\047s length: ", length( $a ), " on a string that was", $d ? "" : " NOT", " modified.\n",
          "FAQ 1 length: ", length( $b ), " on a string that was", $e ? "" : " NOT", " modified.\n",
          "FAQ 2 length: ", length( $c ), " on a string that was", $f ? "" : " NOT", " modified.\n";

}
'
Test: 1 Length of original: 3
Dan's length: 1 on a string that was modified.
FAQ 1 length: 1 on a string that was modified.
FAQ 2 length: 1 on a string that was modified.

Test: 2 Length of original: 2
Dan's length: 1 on a string that was modified.
FAQ 1 length: 1 on a string that was modified.
FAQ 2 length: 1 on a string that was modified.

Test: 3 Length of original: 2
Dan's length: 1 on a string that was modified.
FAQ 1 length: 1 on a string that was modified.
FAQ 2 length: 1 on a string that was modified.

Test: 4 Length of original: 1
Dan's length: 1 on a string that was modified.
FAQ 1 length: 1 on a string that was NOT modified.
FAQ 2 length: 1 on a string that was NOT modified.

Test: 5 Length of original: 3
Dan's length: 3 on a string that was NOT modified.
FAQ 1 length: 0 on a string that was modified.
FAQ 2 length: 0 on a string that was modified.

Test: 6 Length of original: 1
Dan's length: 1 on a string that was NOT modified.
FAQ 1 length: 0 on a string that was modified.
FAQ 2 length: 0 on a string that was modified.

Test: 7 Length of original: 0
Dan's length: 0 on a string that was NOT modified.
FAQ 1 length: 0 on a string that was NOT modified.
FAQ 2 length: 0 on a string that was NOT modified.



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.-- Larry Wall


Re: regex help

2007-08-02 Thread Chas Owens

On 8/2/07, Tony Heal [EMAIL PROTECTED] wrote:
snip
 Why doesn't this work? I want to take any leading or trailing white spaces 
 out.
 If I remove the remark it works, but I
 do not understand why it requires the second line
 $string =~ s/^(\s+)(.*)(\s+)$/$2/;
snip

Because (.*) matches all but the one space needed by the second (\s+).
 The . matches everything including the spaces.  You can fix this by
saying

$string =~ s/^(\s+)(.*?)(\s+)$/$2/;

to make (.*) match the smallest pattern (non-greedy) instead of the
largest (greedy).
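
A quick way to see the difference, using a made-up test string with
two spaces on each side:

```perl
use strict;
use warnings;

my $greedy = "  hello  ";    # two spaces on each side
my $lazy   = "  hello  ";

# Greedy: (.*) swallows "hello " and leaves only one space for the
# final (\s+), so one trailing space ends up inside $2.
$greedy =~ s/^(\s+)(.*)(\s+)$/$2/;

# Non-greedy: (.*?) stops as early as it can, so the final (\s+)
# takes both trailing spaces.
$lazy =~ s/^(\s+)(.*?)(\s+)$/$2/;

print "greedy: [$greedy]\n";
print "lazy:   [$lazy]\n";
```

Note that both patterns require whitespace at both ends of the string;
a string padded on one side only does not match and is left unchanged.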


Re: regex help

2007-08-02 Thread Ricky Zhou

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Tony Heal wrote:
 Why doesn't this work? I want to take any leading or trailing white spaces 
 out. If I remove the remark it works, but I
 do not understand why it requires the second line
For reference, see perldoc perlre and search for "greedy".

Basically, the .* matches as much as possible, so it gets the spaces as
well.  To make it not greedy, you add a ?, so
$string =~ s/^\s+(.*?)\s+$/$1/;
would work.

Hope this helps,
Ricky
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFGskSdZBKKLMyvSE4RAmoLAJ9FPUqm+9utecURkec0gMWItfKEYACgmpeS
lf1qanHZefDeV5z87LMusWo=
=8U17
-END PGP SIGNATURE-


RE: regex help

2007-08-02 Thread Tony Heal

So since '?' will match the last character, group, or class 0 or 1 time, then it 
matches the group of whatever happens to
be in '.*' up to any spaces that are attached to the '$'.

Is that correct?

Tony Heal


 -Original Message-
 From: Chas Owens [mailto:[EMAIL PROTECTED]
 Sent: Thursday, August 02, 2007 4:55 PM
 To: [EMAIL PROTECTED]
 Cc: beginners@perl.org
 Subject: Re: regex help
 
 On 8/2/07, Tony Heal [EMAIL PROTECTED] wrote:
 snip
  Why doesn't this work? I want to take any leading or trailing white spaces 
  out.
  If I remove the remark it works, but I
  do not understand why it requires the second line
  $string =~ s/^(\s+)(.*)(\s+)$/$2/;
 snip
 
 Because (.*) matches all but the one space needed by the second (\s+).
  The . matches everything including the spaces.  You can fix this by
 saying
 
 $string =~ s/^(\s+)(.*?)(\s+)$/$2/;
 
 to make (.*) match the smallest pattern (non-greedy) instead of the
 largest (greedy).



Re: regex help

2007-08-02 Thread Chas Owens

On 8/2/07, Tony Heal [EMAIL PROTECTED] wrote:
 So since '?' will match the last character, group, or class 0 or 1 time, then 
 it matches the group of whatever happens to
 be in '.*' up to any spaces that are attached to the '$'.

 Is that correct?
snip

No, the ? in .*? is not the same as the ? in [abc]?, just as neither
of them is the same as the ? in (?foo).  The character is being
reused, but the meanings are completely separate.  The ? character
when it follows a quantifier (i.e. *, +, ?, {n}, {n,}, or {n,m}) means
match the smallest possible string (non-greedy).  The default for those
quantifiers is to match the largest string possible (greedy).

from perldoc perlre:
   The following standard quantifiers are recognized:

   *      Match 0 or more times
   +      Match 1 or more times
   ?      Match 1 or 0 times
   {n}    Match exactly n times
   {n,}   Match at least n times
   {n,m}  Match at least n but not more than m times
snip
   By default, a quantified subpattern is greedy, that is, it will match
   as many times as possible (given a particular starting location) while
   still allowing the rest of the pattern to match.  If you want it to
   match the minimum number of times possible, follow the quantifier with
   a ?.  Note that the meanings don't change, just the greediness:

   *?     Match 0 or more times
   +?     Match 1 or more times
   ??     Match 0 or 1 time
   {n}?   Match exactly n times
   {n,}?  Match at least n times
   {n,m}? Match at least n but not more than m times
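
A small illustration of the two tables, with a made-up test string and
variable names.  Each pair uses the same quantifier with and without
the trailing ?:

```perl
use strict;
use warnings;

my $text = "aaaa";

my ($plus)       = $text =~ /(a+)/;         # greedy:  "aaaa"
my ($plus_lazy)  = $text =~ /(a+?)/;        # minimal: "a"
my ($range)      = $text =~ /(a{2,3})/;     # greedy:  "aaa"
my ($range_lazy) = $text =~ /(a{2,3}?)/;    # minimal: "aa"
my ($star_lazy)  = $text =~ /(a*?)/;        # minimal: "" (zero times)

print "[$plus] [$plus_lazy] [$range] [$range_lazy] [$star_lazy]\n";
```

Nothing follows the capture group in these patterns, so the minimal
forms really do stop at the minimum.  When more pattern follows, as in
the trimming regex above, a non-greedy quantifier takes as little as it
can while still letting the rest match.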
