Along the same lines, we might need a MODIFIER LETTER HYPHEN, because, for example, the word "ack-ack" isn't decomposable into two words, or even two morphemes, "ack" and "ack".
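To make the word-boundary problem concrete, here is a minimal sketch, assuming Python's standard `re` module. The helper names `naive_words` and `joined_words` are hypothetical and chosen only for illustration. It shows how a plain `\w+` pattern splits the examples from this thread, and how a pattern that tolerates an internal apostrophe or hyphen patches those cases without being a general solution.

```python
import re

# Naive rule: a "word" is a maximal run of \w characters.
NAIVE_WORD = re.compile(r"\w+")

# Patched rule (an assumption, not a standard): also allow an apostrophe
# (ASCII ' or U+2019) or a hyphen when it sits between \w characters.
JOINED_WORD = re.compile(r"\w+(?:['\u2019\-]\w+)*")

def naive_words(text):
    """Return the runs of \\w characters found in text."""
    return NAIVE_WORD.findall(text)

def joined_words(text):
    """Return words, keeping internal apostrophes and hyphens attached."""
    return JOINED_WORD.findall(text)

sample = "Don\u2019t you mind the ack-ack on the fo\u2019c\u2019sle?"

print(naive_words(sample))
# ['Don', 't', 'you', 'mind', 'the', 'ack', 'ack', 'on', 'the', 'fo', 'c', 'sle']

print(joined_words(sample))
# ['Don’t', 'you', 'mind', 'the', 'ack-ack', 'on', 'the', 'fo’c’sle']

# The patch is still not reliable: a leading apostrophe is dropped.
print(joined_words("\u2019tis"))
# ['tis']
```

The patched pattern only works by guessing that U+2019 or U+002D between letters is part of the word, which is exactly the ambiguity a dedicated modifier letter would remove.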
Leo

On Thu, Jun 4, 2015 at 6:31 PM, David Starner <prosfil...@gmail.com> wrote:

> On Thu, Jun 4, 2015 at 2:38 PM Markus Scherer <markus....@gmail.com>
> wrote:
>
>> "don’t" is a contraction of two words, it is not one word.
>
> But as he points out, it's not a contraction of don and t; it is, at best,
> a contraction of do and n't. It's eliding, not punctuating. In the
> comments, he also brings up the examples of "Don’t you mind?" being okay
> but not *"Do not you mind?", and "fo’c’sle".
>
>> You can't use simple regular expressions to find word boundaries.
>
> Who uses _simple_ regular expressions? You can't use any code to reliably
> find word boundaries in English, and that's a problem.