Re: split based on n number of occurances of a character

Chris Charley Fri, 26 Oct 2007 17:24:34 -0700

----- Original Message -----
From: "Jeff Pang" <[EMAIL PROTECTED]>

Newsgroups: perl.beginners
To: "Mahurshi Akilla" <[EMAIL PROTECTED]>
Cc: <beginners@perl.org>
Sent: Friday, October 26, 2007 6:51 AM
Subject: Re: split based on n number of occurances of a character

On 10/26/07, Mahurshi Akilla <[EMAIL PROTECTED]> wrote:

Is there an easy way (without writing our own proc) to split a string
based on the number of occurrences of a character?

for example:

$my_string = "a;b;c;d;e;f;g;h";

@regsplitarray = split(/;/, $my_string);   # splits on every single ";"

What if I want to take 2 at a time? E.g. $regsplitarray[0] = "a;b",
$regsplitarray[1] = "c;d", and so on.

In this example I used 2, but it could be n in general. Is there a way
to do this without writing our own procedure?


--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

Using a regex may be suitable.

$ perl -e '$x= "a;b;c;d;e;f;g;h";@re=$x=~/(\w+;\w+);{0,1}/g;print "@re"'
a;b c;d e;f g;h


The regex that Jeff provided works if there is an even number of items, but
not if the number of items is odd. It only matches complete pairs (\w+;\w+),
so it would miss the lone 'i' at the end of the string below.

my $my_string = "a;b;c;d;e;f;g;h;i";
my @re= $my_string =~/(\w+;\w+);{0,1}/g;

print join "\n", @re;

a;b
c;d
e;f
g;h


To capture the lone 'i' at the end of the string, you would need a regex
like this:

my @pairs = $my_string =~ /(\w+(?:;\w+){0,1})/g;
print join "\n", @pairs;

a;b
c;d
e;f
g;h
i

The regular expression above can be explained as:

* \w+ matches one or more word chars.
* (?:;\w+){0,1} matches 0 or 1 occurrences of a semicolon followed by one
or more word chars.
* The inner parentheses, (?:...), only group; the ?: notation says don't
capture. They are there so that the {0,1} quantifier applies to the whole
";\w+" sequence.
* The outer parentheses, from the first '(' to the final ')', capture the
overall match: the first set of word chars plus the optional second item.
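
The same breakdown can be written into the pattern itself with the /x
modifier, which lets a regex contain whitespace and comments. This is only a
sketch of the pattern above laid out that way; the behaviour should be the
same.

```perl
use strict;
use warnings;

my $my_string = "a;b;c;d;e;f;g;h;i";

# Same pattern as above, spread out with the x modifier so that
# each part can carry its own comment.
my @pairs = $my_string =~ /
    (               # start of the capturing group
        \w+         # one or more word chars
        (?:         # start of a non-capturing group
            ; \w+   # a semicolon followed by one or more word chars
        ){0,1}      # the non-capturing group may match 0 or 1 times
    )               # end of the capturing group
/gx;

print join("\n", @pairs), "\n";
```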

Also, to capture groups of 3, for example, you would only need to change the
quantifier from {0,1} to {0,2}. In general, groups of n need {0,n-1}.

my @threes = $my_string =~ /(\w+(?:;\w+){0,2})/g;
print join "\n", @threes;

a;b;c
d;e;f
g;h;i
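
Since only the upper bound of the quantifier changes, the group size can be
passed in as a variable. Below is a minimal sketch, assuming a small helper
(chunk_items is a made-up name, not something from Perl itself) that
interpolates n-1 into the pattern. It does wrap the regex in a sub, so it is
only a convenience on top of the one-line match.

```perl
use strict;
use warnings;

# chunk_items is a hypothetical helper name used for illustration.
# It returns the ';'-separated items of $string in groups of up to $n.
sub chunk_items {
    my ($string, $n) = @_;
    die "n must be at least 1\n" if $n < 1;

    # Each chunk is one item plus up to n-1 further ";item" parts.
    my $max = $n - 1;
    return $string =~ /(\w+(?:;\w+){0,$max})/g;
}

my @fours = chunk_items("a;b;c;d;e;f;g;h;i", 4);
print join("\n", @fours), "\n";
```

Call it in list context, as above, so that the match returns every captured
chunk.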


(From perldoc perlrequick)

\w is a word character (alphanumeric or _) and represents:  [0-9a-zA-Z_]
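
Because \w covers only those characters, the patterns above would break up
items that contain anything else, such as a hyphen, a dot or a space. If the
items can be arbitrary text, one possible variation (an assumption about the
data, not something the original question states) is to match "anything but a
semicolon", [^;]+, instead of \w+. The sample string below is made up for
illustration.

```perl
use strict;
use warnings;

# Made-up sample data with items that are not pure word chars.
my $my_string = "foo-bar;1.5;c d;x;y";

# [^;]+ matches one or more characters that are not a semicolon,
# so each item may contain hyphens, dots, spaces and so on.
my @pairs = $my_string =~ /([^;]+(?:;[^;]+){0,1})/g;

print join("\n", @pairs), "\n";
```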

Mahurshi, you may want to read the manual sections to understand regexps. At
the command line type:
perldoc perlrequick
perldoc perlretut

Hope this helps,
Chris


