On Mon, 28 Jan 2019 03:48:52 +0000 James Kass via Unicode <unicode@unicode.org> wrote:
> It’s been said that the text segmentation rules seem over-complicated
> and are probably non-trivial to implement properly. I tried your
> suggestion of WORD JOINER U+2060 after tau ( γένοιτ’ ἄν ), but it
> only added yet another word break in LibreOffice.

I said we *don't* have a control that joins words. The text of TUS used to say we had one in U+2060, but that was removed in 2015. I pleaded for the retention of this functionality in document L2/2015/15-192, but my request was refused.

I pointed out in ICU ticket #11766 that ICU's Thai word breaker retained this facility. An investigation was planned, but nothing seems to have come of it. Interestingly, bringing this word breaker into line with TUS in the UK may well be in breach of the Equality Act 2010.

Richard.