https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67556
            Bug ID: 67556
           Summary: Regex \w doesn't support the unicode character <U+200C>
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mm.masaeli at gmail dot com
  Target Milestone: ---

The Unicode character U+200C (ZERO WIDTH NON-JOINER) is called "nimfaseleh" in
Persian, and it is important to treat it as a regular word character. The catch
is that it never occurs at the beginning or end of a word. For example, it is
essential in a word like the following:

میبینم
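
The rule described above (U+200C is allowed inside a word but not at its edges) can be written explicitly in the pattern instead of relying on \w. The following is only a minimal sketch, assuming std::wregex with the default ECMAScript grammar; is_word_with_zwnj is a hypothetical helper name, not part of libstdc++ or of this report. Which letters \w accepts still depends on the locale in use, so the sketch is easiest to check with ASCII letters around the U+200C.

```cpp
#include <regex>
#include <string>

// Hypothetical helper for illustration; not a libstdc++ API.
// Accepts one or more runs of \w characters, where consecutive runs are
// joined by exactly one U+200C (ZERO WIDTH NON-JOINER). The pattern itself
// never allows U+200C as the first or last character of the word.
bool is_word_with_zwnj(const std::wstring& text) {
    // \u200C is expanded by the compiler, so the regex engine receives the
    // literal character rather than an escape sequence.
    static const std::wregex pattern(L"\\w+(?:\u200C\\w+)*");
    return std::regex_match(text, pattern);
}
```

Calling is_word_with_zwnj(L"ab\u200Ccd") should return true, while a string containing a space, such as L"ab cd", should be rejected. This is a workaround at the pattern level and does not change how \w itself classifies U+200C.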