Hi Mike, I appreciate the quick reply. I am familiar with the Unicode::Normalize module (and will also be using that), but I left it out of this question because it's not relevant to the problem I'm currently trying to solve. The text I'm trying to strip diacritics out of does not have precomposed accented characters.
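For illustration only, here is a minimal sketch of stripping combining marks from text that is already decomposed. The sample string and the helper name strip_combining_marks are made up for this example and are not from the actual data or code. It assumes the input is a decoded Perl character string rather than raw UTF-8 bytes, since \p{M} matches on characters, not bytes.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper (the name is illustrative, not from the thread).
# Assumes $text is a decoded character string, not raw UTF-8 bytes.
sub strip_combining_marks {
    my ($text) = @_;
    # NFD leaves already-decomposed text unchanged and splits any
    # precomposed characters into base character + combining marks.
    my $decomposed = NFD($text);
    # Remove every character in the Unicode Mark category (\p{M}).
    $decomposed =~ s/\p{M}+//g;
    return $decomposed;
}

# Made-up sample: "e" followed by U+0301 COMBINING ACUTE ACCENT.
my $sample = "Jose\x{0301}";
print strip_combining_marks($sample), "\n";
```

Running NFD first is harmless on decomposed input, so the same helper would also cover any precomposed characters that happen to slip in.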
-- Michael

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 cell
# [EMAIL PROTECTED]
# http://rocky.uta.edu/doran/

-----Original Message-----
From: Mike Rylander [mailto:[EMAIL PROTECTED]
Sent: Mon 5/5/2008 8:52 PM
To: Doran, Michael D
Cc: [EMAIL PROTECTED]; Perl4lib
Subject: Re: Stripping out Unicode combining characters (diacritics)

On Mon, May 5, 2008 at 8:26 PM, Doran, Michael D <[EMAIL PROTECTED]> wrote:
[snip]
>
> I'm pulling my hair out on this... so any help would be appreciated. If
> there's any other info I can provide, let me know.
>

You'll want to transform the text to NFD format (nominally, base characters plus combining marks) instead of NFC (precombined characters) using Unicode::Normalize:

    use Unicode::Normalize;
    my $text = NFD($original);
    $text =~ s/\pM+//go;

Hope that helps.

--
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone: 1-877-OPEN-ILS (673-6457)
 | email: [EMAIL PROTECTED]
 | web: http://www.esilibrary.com