> > - line break (wrapping lines on the screen)
> > - word break (for selection)
> > - word/root extraction (for search)
>
> I recognize that the second and third case are really
> difficult to handle.
Root extraction is decidedly non-trivial and highly language-specific,
even more so than word breaking. It is a messy linguistic problem
rather than a clean algorithmic one.
To start with, the choice of the term "extraction" shows that one has not
understood the problem in all its g(l)ory: a more appropriate term would be
"finding" the root, or maybe "reducing" the word to it.
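
To make the point concrete, here is a deliberately naive sketch in Python of
what "extraction" would amount to if taken literally: cutting a suffix off the
end of the word. The helper name naive_root and the suffix list are made up
for illustration, and the suffixes are English-only assumptions, not a real
stemming algorithm.

```python
# Hypothetical helper for illustration only: strips a few English suffixes.
SUFFIXES = ("ing", "ed", "s")

def naive_root(word: str) -> str:
    """Cut the first matching suffix off the end of the word."""
    for suffix in SUFFIXES:
        # Require at least three letters to remain after stripping.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

for w in ("walking", "running", "making", "sang", "mice"):
    print(w, "->", naive_root(w))
```

Only "walking" comes out right. "running" and "making" yield "runn" and "mak",
which are not roots, and "sang" and "mice" are left untouched because their
roots ("sing", "mouse") are not substrings of the word at all. There is nothing
to extract; the root has to be found.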
Also, I would add
- "syllabification" (if that is the word) as a third problem (for breaking words
more nicely across lines). It would rank in difficulty somewhere between word
breaking and root extraction.
> But for word wrapping I assume line
> breaking is sufficient. But when I don't have spaces to use
> for wrapping and/or don't know whether the actual text part
> uses spaces at all (what about exotic languages like Ogham or
> Anglo-saxon?) then how can I go to implement word wrapping?
> Simply do it character by character?