Hyphenation problem in Bug 23985
Actually, implementing UTR14 would solve the line breaking problem, although not the URL breaking problem.
Points to discuss:
- JDK 1.4 has a java.text.BreakIterator, which implements UTR14, or at least the parts relevant for western languages (can't see them dealing with Thai properly). The questions:
  + Can somebody verify whether this class is already available in JDK 1.3? I already deleted the 1.3 docs, and can't be bothered to reinstall them.
  + Can this class really be leveraged? We probably need to supply a CharacterIterator which accumulates the running line width in some global state, and check after each return of the iterator whether the line is full. This might fit well with getNextBreak(), but I have difficulty seeing how this would interact with hyphenation.
- Should we provide for custom line breaking algorithms? Some languages/scripts like Thai almost certainly require augmenting any stock line breaking algorithm. However, the problem here seems to be more about clever breaking of non-natural-language stuff, like URLs. We can leave this completely to the FO creators, forcing them for example to
  + use language="x-url" to turn off hyphenation locally
  + use glue characters like NBZWS to control whether the stock line breaking algorithm breaks after slashes
  The latter is quite intrusive.
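To make the BreakIterator question more concrete, here is a minimal sketch of greedy line filling on top of BreakIterator.getLineInstance(). It is only an illustration under assumptions: the class name GreedyLineBreaker, the method breakLines() and the one-unit-per-character width() model are made up for this example and are not FOP code, and hyphenation is ignored entirely.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of FOP: greedy line filling driven by
// the break opportunities reported by java.text.BreakIterator.
final class GreedyLineBreaker {

    // Assumed width model: every character is one unit wide.
    // A real layout engine would ask the font metrics instead.
    static int width(String s) {
        return s.length();
    }

    static List<String> breakLines(String text, int maxWidth) {
        BreakIterator it = BreakIterator.getLineInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> lines = new ArrayList<>();
        int lineStart = 0;  // index where the current line begins
        int lastBreak = 0;  // last break opportunity seen so far
        for (int pos = it.next(); pos != BreakIterator.DONE; pos = it.next()) {
            // Trailing spaces do not count against the line width.
            String candidate = text.substring(lineStart, pos).trim();
            if (width(candidate) > maxWidth && lastBreak > lineStart) {
                // The line is full: close it at the previous opportunity.
                lines.add(text.substring(lineStart, lastBreak).trim());
                lineStart = lastBreak;
            }
            lastBreak = pos;
        }
        if (lineStart < text.length()) {
            lines.add(text.substring(lineStart).trim());
        }
        return lines;
    }
}
```

A segment that is wider than maxWidth and contains no break opportunity simply overflows in this sketch. That is exactly the spot where hyphenation would have to hook in, and the iterator offers no obvious place for it.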
I've got my own UTR14 implementation (simplified, of course), which should appear on http://cvs.apache.org/~pietsch later this evening for review. It uses a LineBreakStatus object for tracking the status, which might be folded into the LayoutContext or a subclass used for inline FOs and text.
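To give an idea of what tracking the status in an object can look like, here is a rough sketch. It is not the implementation mentioned above: the class name LineBreakStatusSketch, the four character classes and the pair rules are assumptions chosen for illustration, and they cover only a tiny, non-conformant subset of UTR14 (spaces, a few glue characters, slashes, everything else).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified status tracker; not conformant to UTR14.
final class LineBreakStatusSketch {

    // Assumed classes: space, glue, slash, and "everything else".
    enum CharClass { SP, GL, SY, AL }

    // Class of the previously seen character, null at start of text.
    private CharClass prev = null;

    static CharClass classify(char c) {
        if (c == ' ') return CharClass.SP;
        // NBSP, WORD JOINER and ZWNBSP are all treated as glue here.
        if (c == '\u00A0' || c == '\u2060' || c == '\uFEFF') return CharClass.GL;
        if (c == '/') return CharClass.SY;
        return CharClass.AL;
    }

    // Feeds the next character and reports whether a break is
    // allowed directly before it.
    boolean breakAllowedBefore(char c) {
        CharClass cur = classify(c);
        boolean allowed;
        if (prev == null) allowed = false;               // never at start of text
        else if (cur == CharClass.SP) allowed = false;   // never before a space
        else if (prev == CharClass.GL) allowed = false;  // glue binds what follows
        else if (prev == CharClass.SP) allowed = true;   // break after spaces
        else if (cur == CharClass.GL) allowed = false;   // glue binds what precedes
        else if (cur == CharClass.SY) allowed = false;   // never before a slash
        else if (prev == CharClass.SY) allowed = true;   // break after a slash
        else allowed = false;                            // keep words together
        prev = cur;
        return allowed;
    }

    // Convenience driver: indices before which a break is allowed.
    static List<Integer> breakPositions(String text) {
        LineBreakStatusSketch status = new LineBreakStatusSketch();
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (status.breakAllowedBefore(text.charAt(i))) result.add(i);
        }
        return result;
    }
}
```

Because the only state is the class of the previous character, an object like this could be carried along in a layout context and fed one character at a time. In this sketch a glue character placed after a slash suppresses the break that the slash would otherwise permit.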
Comments?
J.Pietschmann