Jeff,

Thanks for taking the time to look over my problem.  In the end, I did end
up using your idea on the map (scalar reverse).  It worked like a champ!
Thanks again.

Trevor

-----Original Message-----
From: Jeff 'japhy' Pinyan [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 07, 2003 6:19 PM
To: Trevor Morrison
Cc: [EMAIL PROTECTED]
Subject: Re: Regex Pattern


On Aug 7, Trevor Morrison said:

>I am trying to use regex to extract the city,state and zip out of a file.
>Now, the problem is the city can have more than one word to it like
>
>San Antonio
>San Francisco
>
>I have also bumped into cases of 4 or 5 words in the city name!  I am
>looking for a regex that will take this into account, knowing that
>the form of the line is:
>
>city[space]state(2 letters)[space] zip (could be 5 or 9 digits)
>
>Loves Park IL 61111
>APO  AE NY 09012
>St. George UT 84770
>Columbus OH 43202
>Salt Lake City UT 84118

The simplest way is to ignore what form the city might take, and deal with
the knowns -- the state will be two uppercase letters, and the zip code
will be five to nine digits.

  my ($city, $state, $zip) = $line =~ /^(.*) ([A-Z]{2}) (\d{5,9})$/;

This assumes the fields are separated by a space.  It also only checks for
AT LEAST 5 and AT MOST 9 digits, so it would let a 7-digit zip code
through.  If you want to be more robust, the last part of the regex could
be

  (\d{5}(?:\d{4})?)

which ensures 5 digits, and then optionally matches 4 more.  Then again,
maybe just

  (\d{9}|\d{5})

is simpler on the eyes and brain.
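
To see the stricter zip check at work, here is a minimal sketch.  The sub
name parse_line is made up for illustration, and the 7-digit sample line is
invented to show a rejection; the other lines come from the original post.

```perl
use strict;
use warnings;

# parse_line is a hypothetical helper name, used only for illustration.
sub parse_line {
    my ($line) = @_;

    # city: anything; state: two uppercase letters; zip: 9 or 5 digits
    my ($city, $state, $zip) = $line =~ /^(.*) ([A-Z]{2}) (\d{9}|\d{5})$/
        or return;    # empty list when the line does not match

    return ($city, $state, $zip);
}

for my $line ('Loves Park IL 61111',
              'Salt Lake City UT 84118',
              'Columbus OH 4320212') {    # invented 7-digit zip
    if (my @fields = parse_line($line)) {
        print join('|', @fields), "\n";
    }
    else {
        print "no match: $line\n";
    }
}
```

The first two lines should split into three fields, and the 7-digit line
should be reported as not matching, because the trailing $ stops \d{5} from
matching only part of a longer run of digits.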

Another approach, if you don't really care about the format of the lines,
and you just want to extract the fields regardless of HOW they look, is to
reverse the line, split it into three pieces, and then reverse each of
those pieces.  A line like "Salt Lake City UT 54321" would be reversed to
"12345 TU ytiC ekaL tlaS", split into three pieces ("12345", "TU", and
"ytiC ekaL tlaS"), and then each of those pieces would be reversed again,
to give back "54321", "UT", and "Salt Lake City".

  my ($zip, $state, $city) =
    map { scalar reverse }  # reverse (in scalar context) (it's important)
    split ' ', (reverse $line), 3;

But maybe that's more than you need.
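
For what it's worth, here is the reverse-and-split idea as a self-contained
sketch.  The sub name split_address is made up for illustration.  The
"scalar" matters because reverse in list context reverses the order of its
arguments rather than the characters of a string.

```perl
use strict;
use warnings;

# split_address is a hypothetical name, for illustration only.
sub split_address {
    my ($line) = @_;

    # Reverse the whole line, split on whitespace into at most 3 pieces,
    # then reverse each piece back into reading order.
    my ($zip, $state, $city) =
        map { scalar reverse $_ }
        split ' ', scalar(reverse $line), 3;

    return ($city, $state, $zip);
}

my ($city, $state, $zip) = split_address('Salt Lake City UT 84118');
print "$city / $state / $zip\n";    # Salt Lake City / UT / 84118
```

Because the limit of 3 leaves the rest of the reversed line in the last
piece untouched, any odd spacing inside the city name (as in "APO  AE")
comes back exactly as it was.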

--
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
[  I'm looking for programming work.  If you like my work, let me know.  ]


--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

