I posted the following question on the Vi and Vim Stack Exchange
<http://vi.stackexchange.com/questions/5452/set-line-breaks-word-wraps-and-word-searching-for-thai-and-other-non-latin-lang>
and was told that the vim-dev mailing list would be a more appropriate
place to ask.

Brian

I have edited it here as best I can, on the assumption that the entered
text is UTF-8.

My immediate need is a solution for Thai, but rather than a hack, there
should be a more general solution that will help the more than 1 billion
speakers of the various Indic languages.

I can set the text width and then manually re-wrap an imported paragraph,
for example with the following:

set textwidth=72
gqq

I can also navigate English text files with the standard 'w' 'b' 'e' '*'
commands, etc.

This works well for English. However, Thai and other Brahmic scripts of
South and Southeast Asia put spaces between phrases rather than between
words. LibreOffice, Word, InDesign, TeX, etc. "know" where line breaks
should occur. They also "know" where individual words are, even though
there are no spaces, so I can navigate by Thai word in these programs. I
can even type English, Thai and Lao in the Chrome address bar and then use
Option (Alt) plus the arrow keys on my Mac to navigate at the word level in
all three languages. It seems that these programs are tapping into work
that has already been done at some lower level. If Vim could tap into the
same work, then someone could edit a multi-language document without having
to do anything fancy: 'w', 'dw', etc. would just move happily from one word
to the next regardless of the language.
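
To make the idea of "finding words without spaces" concrete, here is a
minimal Python sketch of greedy longest-match segmentation against a word
list. My understanding is that existing libraries use far larger
dictionaries and more refined rules; the four-word TOY_DICTIONARY and the
function name segment() are purely illustrative assumptions, not anything
taken from Vim or from an existing library.

```python
# Hypothetical toy word list; a real segmenter would load a full dictionary.
TOY_DICTIONARY = {"ฉัน", "กิน", "ข้าว", "แล้ว"}


def segment(text, dictionary=TOY_DICTIONARY):
    """Split text into words using greedy longest-match lookup.

    Characters that do not start any dictionary word are emitted one
    at a time, so the concatenation of the result always equals the input.
    """
    max_len = max(len(word) for word in dictionary)
    words = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                break
        else:
            # No dictionary word starts here: emit a single character.
            candidate = text[i]
        words.append(candidate)
        i += len(candidate)
    return words


if __name__ == "__main__":
    print(segment("ฉันกินข้าวแล้ว"))
```

With word boundaries like these available, a motion such as 'w' would only
need to jump to the start of the next segment. Greedy matching can pick the
wrong split for ambiguous text, which is one reason to reuse an existing,
well-tested library rather than reimplement this.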

Line breaking poses a different challenge. Because these languages put
spaces only between phrases, the presence or absence of a trailing space at
the end of a line has meaning when breaking and joining lines. By way of
example, these spaces work like an Oxford comma or other punctuation, and
can be the difference between whether or not we had Grandma for breakfast
("Let's eat Grandma." vs. "Let's eat, Grandma."). One, also, doesn't,
want, random, spaces, appearing, when, they, are, not, needed.
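
The behaviour I am after when breaking and joining can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
text has already been split into words (with each phrase-level space kept
as its own token), the names display_width(), wrap_words() and join_lines()
are hypothetical, and width is approximated by skipping non-spacing marks
rather than by a proper terminal-width calculation.

```python
import unicodedata


def display_width(word):
    """Approximate width: count code points that are not combining marks."""
    return sum(
        1 for ch in word
        if unicodedata.category(ch) not in ("Mn", "Me")
    )


def wrap_words(words, width):
    """Break only between words; never insert or remove a space."""
    lines = []
    current = ""
    current_width = 0
    for word in words:
        w = display_width(word)
        # Start a new line if this word would overflow the current one.
        if current and current_width + w > width:
            lines.append(current)
            current = ""
            current_width = 0
        current += word
        current_width += w
    if current:
        lines.append(current)
    return lines


def join_lines(lines):
    """Join by plain concatenation, so existing spaces survive unchanged."""
    return "".join(lines)


if __name__ == "__main__":
    tokens = ["ฉัน", "กิน", "ข้าว", " ", "แล้ว"]
    wrapped = wrap_words(tokens, 5)
    print(wrapped)
    print(join_lines(wrapped) == "".join(tokens))
```

The point of the design is that a line ends with a space only if the
original text had one there, so breaking and then joining gives back
exactly the original paragraph.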

*My question is twofold:*

1. How can Vim tap into already available libraries in order to recognize
   words in Indic languages (including, and especially, Thai) for the
   purpose of navigation and other word-level Vim commands?
2. Is it possible to add language awareness for the purpose of line
   breaking, so that Vim does not strip or add spaces when breaking or
   joining lines at words in Thai or other Indic languages?
