looking for suggestions
all,

below is a sub i created to try to properly capitalize surnames of irish/scottish descent, converting Macarthur => MacArthur, o'donnell => O'Donnell, etc.

it works as intended but i was wondering if anyone can suggest improvements in size and efficiency (realizing the two are not necessarily compatible). also rules for any additional naming styles would be appreciated since the sites using this will have a fairly global audience, name-wise.

thanks in advance,
joe

#
print &surname(NAME => $ARGV[0]) . "\n";

# SUB SURNAME
# removes leading/trailing whitespace
# consolidates grouped whitespaces into single whitespaces
# capitalizes first letter after "Mac/Mc/'" in name (names of Scottish/Irish descent)
# capitalizes first letter of name upon return
sub surname
{
    my %options = @_;
    # $options{NAME} = name to capitalize internally, if appropriate

    # remove leading and trailing whitespace, consolidate whitespace groupings into single whitespaces
    $options{NAME} = join(' ', split(' ', $options{NAME}));

    if ($options{NAME} =~ /^M[a]*c|'/i)
    {
        $options{NAME} =~ m/c|'/g;
        my $pos = pos $options{NAME};
        substr($options{NAME}, $pos, 1) = uc(substr($options{NAME}, $pos, 1));
    }   # end of if ($options{NAME} =~ /^M[a]*c|'/)

    return(ucfirst($options{NAME}));
}   # end of sub surname
#

--
since this is a gmail account, please verify the mailing list is included in the reply to addresses

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
Re: looking for suggestions
On 10-10-07 02:08 PM, jm wrote:
> it works as intended but i was wondering if anyone can suggest
> improvements in size and efficiency

See `perldoc perlre` and search for /\\u/, /\\U/, /\\l/, and /\\L/.

--
Just my 0.0002 million dollars worth,
Shawn

Programming is as much about organization and communication
as it is about coding.

The secret to great software: Fail early & often.

Eliminate software piracy: use only FLOSS.
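For readers who have not met these escapes, here is a minimal sketch of what they do inside the replacement half of s///. The sub name demo_case_escapes and the sample word are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# \u uppercases only the next character; \U uppercases everything up to \E.
# \l and \L are the lowercase counterparts.
# demo_case_escapes is a hypothetical helper, not part of the original post.
sub demo_case_escapes {
    my ($word) = @_;
    (my $first_up = $word) =~ s/^(\w)/\u$1/;     # same effect as ucfirst
    (my $all_up   = $word) =~ s/^(\w+)/\U$1\E/;  # same effect as uc
    (my $first_lo = $word) =~ s/^(\w)/\l$1/;     # same effect as lcfirst
    (my $all_lo   = $word) =~ s/^(\w+)/\L$1\E/;  # same effect as lc
    return ($first_up, $all_up, $first_lo, $all_lo);
}

print join(', ', demo_case_escapes('mcDonald')), "\n";
# prints: McDonald, MCDONALD, mcDonald, mcdonald
```

The point for the surname sub is that \u can be applied to a capture directly in the replacement, so no separate pos/substr step is needed.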
Re: looking for suggestions
jm wrote:
> all,

Hello,

> # SUB SURNAME
> # removes leading/trailing whitespace
> # consolidates grouped whitespaces into single whitespaces
> # capitalizes first letter after "Mac/Mc/'" in name (names of Scottish/Irish descent)
> # capitalizes first letter of name upon return
> sub surname
> {
>     my %options = @_;

Do you really need a hash for a single value? Why not just a scalar?

>     # $options{NAME} = name to capitalize internally, if appropriate
>
>     # remove leading and trailing whitespace, consolidate whitespace groupings into single whitespaces
>     $options{NAME} = join(' ', split(' ', $options{NAME}));
>
>     if ($options{NAME} =~ /^M[a]*c|'/i)

Why use a character class with just one character? /[a]*/ and /a*/ do exactly the same thing. And does that mean that "Maaac" is valid? Because that is what the pattern matches. Perhaps you want /^Ma?c|'/i.

>     {
>         $options{NAME} =~ m/c|'/g;

Why are you running another regular expression? And why are you using the /g option?

>         my $pos = pos $options{NAME};
>         substr($options{NAME}, $pos, 1) = uc(substr($options{NAME}, $pos, 1));

What happens if $options{NAME} only contains "Mac"?

>     }   # end of if ($options{NAME} =~ /^M[a]*c|'/)
>
>     return(ucfirst($options{NAME}));
> }   # end of sub surname

John
--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction.
                -- Albert Einstein
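John's point about [a]* versus a? can be checked directly. The sketch below isolates just the Mac/Mc prefix test (the |' alternative is left out on purpose); the helper names matches_loose and matches_strict are invented for this illustration.

```perl
use strict;
use warnings;

# Original prefix test: any number of "a"s between M and c.
# (hypothetical helper name, not from the thread)
sub matches_loose  { my ($n) = @_; return $n =~ /^M[a]*c/i ? 1 : 0 }

# Tightened test suggested by John: at most one "a".
# (hypothetical helper name, not from the thread)
sub matches_strict { my ($n) = @_; return $n =~ /^Ma?c/i ? 1 : 0 }

for my $name (qw(McCoy MacArthur Maaac)) {
    printf "%-10s loose=%d strict=%d\n",
        $name, matches_loose($name), matches_strict($name);
}
# McCoy and MacArthur pass both tests; Maaac passes only the loose one.
```

Also worth knowing: in /^Ma?c|'/ the ^ anchor belongs only to the first alternative, so the pattern reads as "starts with Mc/Mac, or contains an apostrophe anywhere".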
Re: looking for suggestions
On Thu, Oct 7, 2010 at 1:22 PM, Shawn H Corey wrote:
> See `perldoc perlre` and search for /\\u/, /\\U/, /\\l/, and /\\L/.

Shawn and John,

thanks, your leads gave me this:

#
#!/usr/bin/perl

print &surname($ARGV[0]) . "\n";

# SUB SURNAME
# removes leading/trailing whitespace
# consolidates grouped whitespaces into single whitespaces
# capitalizes first letter after "Mac/Mc/'" in name (names of Scottish/Irish descent)
# capitalizes first letter of name upon return
sub surname
{
    my $name = shift;
    $name = join(' ', split(' ', $name));
    $name =~ s/(^[Mm]a?c|.')(.*)/\u$1\u$2/;
    return(ucfirst($name));
}   # end of sub surname
#

John, to answer some of your questions:

the hash was legacy from earlier subs i've created, to allow for a more generic structure. i don't foresee that necessity here so i changed to a scalar.

i also changed the first regex to use a?; i'm not as comfortable with regexes as i'd like yet.

the 2nd regex was required to allow the pos function to extract the position of the desired character. per the docs, the /g is a requirement for pos (at least as i understand it).

since 'mac' is ignored by the substitution (as is any other 'conventional' name) the ucfirst takes care of all those upon return().

i'm thinking about trying to include the whitespace cleanup in the s/// but i'm thinking it'll be an ugly piece of code i'll always have trouble understanding.

again, thanks for your help, gentlemen.
joe
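As a quick sanity check, the revised sub can be run over a handful of inputs. This is only a sketch: the sub body is the one posted above with strict and warnings switched on, and the sample names are made up for the test.

```perl
use strict;
use warnings;

# Same logic as the revised sub in the post above.
sub surname {
    my ($name) = @_;
    $name = join ' ', split ' ', $name;          # trim and squeeze whitespace
    $name =~ s/(^[Mm]a?c|.')(.*)/\u$1\u$2/;      # capitalize prefix and remainder
    return ucfirst $name;
}

# Sample inputs are illustrative only.
for my $raw ("macarthur", "  o'donnell ", "mcdonald", "smith", "mac", "macy") {
    printf "%-14s => %s\n", "[$raw]", surname($raw);
}
# macarthur => MacArthur, o'donnell => O'Donnell, mcdonald => McDonald,
# smith => Smith, mac => Mac, macy => MacY
```

Two observations from running it: 'mac' is in fact matched by the substitution (with an empty second capture), it just comes out the same as ucfirst would give; and a name such as 'macy' is turned into 'MacY', so the rule will need an exception list if such names matter.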
Re: looking for suggestions
On 10/7/10 Thu Oct 7, 2010 12:20 PM, "jm" scribbled:

> i'm thinking about trying to include the whitespace cleanup in the
> s/// but i'm thinking it'll be an ugly piece of code i'll always have
> trouble understanding.

Use a separate regex instead of the join/split:

    $name =~ s/\s+/ /g;

Not ugly. Easy to understand: "substitute any substring of one or more whitespace characters with a single space character".

Don't try to add this to your other regex. I am not sure that can even be done. I am sure that it is not worth it.

Here is one perhaps more specific to your problem that may be a little harder to understand:

    $name =~ s/ {2,}/ /g;

That one will not substitute a single space with a single space, but you are not likely to notice the difference in execution speed (if there even is one). \s includes spaces, tabs, and newlines, so they are not exactly equivalent.

Other possibilities:

    $name =~ s/\s{2,}/ /g;
    $name =~ s/[ ]{2,}/ /g;
    $name =~ s/\s\s+/ /g;
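The "not exactly equivalent" remark is easy to see with an input that contains a tab. A small sketch, with made-up helper names (squeeze_ws, squeeze_spaces) and a made-up sample string:

```perl
use strict;
use warnings;

# \s+ treats tabs and newlines as whitespace too.
# (hypothetical helper name)
sub squeeze_ws     { my ($s) = @_; $s =~ s/\s+/ /g;  return $s }

# " {2,}" only touches runs of two or more literal spaces.
# (hypothetical helper name)
sub squeeze_spaces { my ($s) = @_; $s =~ s/ {2,}/ /g; return $s }

my $raw = "van\t der   berg";   # tab + space, then three spaces
print "[", squeeze_ws($raw), "]\n";       # [van der berg]
print "[", squeeze_spaces($raw), "]\n";   # tab survives: [van<TAB> der berg]
```

So if the names can arrive with tabs or newlines in them (pasted from a form, say), the \s+ version is probably the safer of the two.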
Re: looking for suggestions
On Thu, Oct 7, 2010 at 3:01 PM, Jim Gibson wrote:
> Use a separate regex instead of the join/split:
>
> $name =~ s/\s+/ /g;

jim,

thanks. i'd initially considered separate regexes for whitespace but decided on the join because it takes care of all whitespace (leading/trailing/embedded) in one fell swoop. i won't be trying to combine the join with the existing regex; decided i'm not that much of a glutton for punishment.

i actually did understand the {2,} so maybe i'm not as far out in the cold as i'd feared :) .

i appreciate your insights and suggestions.
joe
Re: looking for suggestions
> "JG" == Jim Gibson writes:

  >> i'm thinking about trying to include the whitespace cleanup in the
  >> s/// but i'm thinking it'll be an ugly piece of code i'll always have
  >> trouble understanding.

  JG> Use a separate regex instead of the join/split:

  JG> $name =~ s/\s+/ /g;

normally i would agree, but his split also deleted leading and trailing whitespace since split ' ' has that special side effect.

  JG> Not ugly. Easy to understand: "substitute any substring of one or more
  JG> whitespace characters with a single space character".

even better and faster is to use tr/ //s (assuming only spaces and not tabs/newlines, etc).

uri

--
Uri Guttman -- u...@stemsystems.com http://www.sysarch.com
-- Perl Code Review, Architecture, Development, Training, Support --
-- Gourmet Hot Cocoa Mix http://bestfriendscocoa.com --
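To make uri's point concrete, here is a small side-by-side of the three approaches on a string with leading and trailing spaces. The helper names (via_split, via_subst, via_tr) and the sample string are invented for this sketch.

```perl
use strict;
use warnings;

# split ' ' skips leading whitespace and drops trailing empty fields,
# so the join comes back trimmed as well as squeezed.
sub via_split { my ($s) = @_; return join ' ', split ' ', $s }

# s/\s+/ /g squeezes runs but leaves one space at each end.
sub via_subst { my ($s) = @_; $s =~ s/\s+/ /g; return $s }

# tr/ //s squeezes runs of literal spaces only (no tabs/newlines), no trimming.
sub via_tr    { my ($s) = @_; $s =~ tr/ //s; return $s }

my $raw = "  o'donnell   mac  ";
print "[", via_split($raw), "]\n";   # [o'donnell mac]
print "[", via_subst($raw), "]\n";   # [ o'donnell mac ]
print "[", via_tr($raw), "]\n";      # [ o'donnell mac ]
```

If the s/// or tr route is used, a separate trim of the ends is still needed, which is why the join/split one-liner is a reasonable choice here.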
Re: looking for suggestions
jm wrote:
> the 2nd regex was required to allow the pos function to extract the
> position of the desired character. per the docs, the /g is a
> requirement for pos (at least as i understand it).

You could use the @+ and @- arrays to find the start and end position of a regular expression match.

perldoc perlvar

John
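A rough sketch of how the original pos-based version might look with @+ instead of a second m//g match. It assumes John's suggested pattern /^Ma?c|'/i; the sub name surname_via_offsets and the length guard are additions for this illustration, not something posted in the thread.

```perl
use strict;
use warnings;

# $-[0] is the offset where the last successful match started,
# $+[0] is the offset just past its end (see perldoc perlvar).
# surname_via_offsets is a hypothetical name for this sketch.
sub surname_via_offsets {
    my ($name) = @_;
    if ($name =~ /^Ma?c|'/i) {
        my $pos = $+[0];    # first character after "Mc", "Mac" or "'"
        # guard (an addition): only touch the string if a character follows
        substr($name, $pos, 1) = uc substr($name, $pos, 1)
            if $pos < length $name;
    }
    return ucfirst $name;
}

print surname_via_offsets($_), "\n" for "macarthur", "o'donnell", "mac";
# prints MacArthur, O'Donnell, Mac (one per line)
```

This needs only one match and no /g, though the s/// with \u shown earlier in the thread is shorter still.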
Re: looking for suggestions
On Thursday 07 October 2010 21:20:19 jm wrote:
> thanks, your leads gave me this:
>
> #
> #!/usr/bin/perl
>

Add strict and warnings.

> print &surname($ARGV[0]) . "\n";

Don't use leading-ampersand in subroutine calls. Also see:

http://perl-begin.org/tutorials/bad-elements/

Regards,

        Shlomi Fish

--
Shlomi Fish       http://www.shlomifish.org/

Please reply to list if it's a mailing list post - http://shlom.in/reply .
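Some background on both pieces of advice, as a small sketch. All sub names here (show_args, caller_with_amp, caller_plain, strict_catches_typo) are invented for the demonstration.

```perl
use strict;
use warnings;

# Returns how many arguments it received.
sub show_args { return scalar @_ }

# "&show_args" with no parentheses hands the caller's own @_ to the callee.
sub caller_with_amp { return &show_args }

# A plain call with empty parentheses passes nothing.
sub caller_plain    { return show_args() }

print caller_with_amp(1, 2, 3), "\n";   # 3: arguments leaked through
print caller_plain(1, 2, 3), "\n";      # 0

# strict turns a misspelled variable into a compile-time error.
# The string eval is only here so the failure can be shown without
# stopping the script.
sub strict_catches_typo {
    my $ok = eval q{ use strict; $nmae = 'x'; 1 };
    return $ok ? 0 : 1;
}
print "typo caught\n" if strict_catches_typo();
```

With parentheses, &surname(...) and surname(...) behave much the same for an ordinary sub, but the ampersand form also bypasses prototypes, so the plain call is the safer habit.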