Regex problem with accented characters

2007-03-27 Thread Beginner
Hi, I am trying to extract the ISO code and country name from a three-column table (taken from en.wikipedia.org) and have noticed a problem with accented characters such as Ô. Below is my script and a sample of the data I am using. When I run the script the code beginning CI for Côte d'Ivoire

Re: Regex problem with accented characters

2007-03-27 Thread Mumia W.
On 03/27/2007 03:34 AM, Beginner wrote: Hi, I am trying to extract the ISO code and country name from a three-column table (taken from en.wikipedia.org) and have noticed a problem with accented characters such as Ô. Below is my script and a sample of the data I am using. When I run the script

Re: Regex problem with accented characters

2007-03-27 Thread Rob Dixon
Beginner wrote: Hi, I am trying to extract the ISO code and country name from a three-column table (taken from en.wikipedia.org) and have noticed a problem with accented characters such as Ô. Below is my script and a sample of the data I am using. When I run the script the code beginning CI

Re: Regex problem with accented characters

2007-03-27 Thread Rob Dixon
Beginner wrote:

    /^(\w{2})\s+(\w+\s\w+\s\w+s\w+|\w+\s\w+\s\w+|\w+\s\w+|\w+)/);

It's worth noting that this could be written:

    /^(\w{2})\s+(\w+(?:\s\w+)*)/);

Rob
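For anyone reading the archive, here is a minimal sketch of how the shorter pattern behaves on a line such as the Côte d'Ivoire entry. It rests on assumptions that the truncated posts above do not confirm: that the input is UTF-8, and that the code and name are separated by whitespace. The helper name parse_line and the sample line are made up for illustration. On raw bytes \w does not match the two bytes that encode ô, so the name is cut short; after decoding to characters it does. Note that the apostrophe is not a word character in either case, so the name still stops at "d".

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: split "CODE  Country Name" into (code, name)
# using the compact pattern suggested in the thread.
sub parse_line {
    my ($line) = @_;
    return $line =~ /^(\w{2})\s+(\w+(?:\s\w+)*)/;
}

# Assumed sample line as raw UTF-8 bytes, i.e. what you get when a
# file is read without an encoding layer. \xC3\xB4 is "ô" in UTF-8.
my $bytes = "CI  C\xC3\xB4te d'Ivoire";

# The same line decoded into Perl characters.
my $chars = decode('UTF-8', $bytes);

my (undef, $name_from_bytes) = parse_line($bytes);
my (undef, $name_from_chars) = parse_line($chars);

binmode(STDOUT, ':encoding(UTF-8)');
print "bytes: $name_from_bytes\n";   # name stops before the accented letter
print "chars: $name_from_chars\n";   # accented letter matched, stops at the apostrophe
```

If the data really is UTF-8, opening the file with an encoding layer such as open(my $fh, '<:encoding(UTF-8)', $file) has the same effect as the explicit decode. Names containing apostrophes or hyphens would still need a wider character class than \w.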