Re: Esperanto dictionary (oops)

2007-04-20 Thread A.J.Mechelynck

A.J.Mechelynck wrote:
[...]
IIUC, the eo_EO region would use the "actual" letters with superscripts 
(as found in Latin3 or Unicode), and the eo_UX region would use method 
4, which is widely used on computer systems, including in e-mails 


...oops... s/method 4/the last method above/

emanating from the Universal Esperanto Association (i.e., the 
Esperantist headquarters) in Rotterdam.



Best regards,
Tony.
--
Real Users hate Real Programmers.


Re: Esperanto dictionary

2007-04-20 Thread A.J.Mechelynck

Bram Moolenaar wrote:

Cyril Slobin wrote:


On 4/20/07, Bram Moolenaar <[EMAIL PROTECTED]> wrote:


Please take the existing $VIMRUNTIME/spell/eo/main.aap and modify it a
bit to build the .spl file.  This can't be very difficult, you would
mostly use the command you type manually.

OK, I'll try this. Probably tomorrow.


Good.


What is strange is that myspell uses eo_l3 and you have eo_EO and eo_UX.
Why two regions?

Esperanto uses some letters from the Latin-3 character set. Of
course, they are in Unicode too. But during half a century in an
ASCII-based world, some conventions were established for
transcribing these letters in pure ASCII. There is still some
disagreement about which one is most popular, or most standard, or most
suitable, but I believe that the "Cxirkaux" convention is the most widely used
(no, I can't prove this with statistics). The convention is named
"Cxirkaux" after the transcription of the word "Ĉirkaŭ" (I hope you have
an appropriate font installed to read this). It is handy to be able to
check Esperanto text in both modes (or to choose either one of the two).
Making two files -- eo.ascii.spl and eo.utf-8.spl -- would probably be
theoretically purer, but my solution allows switching between the two
modes quickly.


OK.  So when the user does ":set spl=eo_eo" he still gets the "pure"
version?  It's important that the user has a choice of what words he
wants to accept.



I think you got it. For anyone who wants a better understanding of the 
matter, here's the explanation I could come up with:


The non-ASCII letters in the Esperanto variant of the Latin alphabet are the 
following (in upper and lower case):


Ĉĉ, C-circumflex
Ĝĝ, G-circumflex
Ĥĥ, H-circumflex
Ĵĵ, J-circumflex
Ŝŝ, S-circumflex
Ŭŭ, U-breve

There are several known ways to transliterate them to ASCII, and 
Esperantists have been arguing without end about which one is "the best", 
with partisans of the first and last ones below flaming each other for weeks 
on end in the Usenet group soc.culture.esperanto and elsewhere:


* "h-method". In the original work which brought Esperanto to the public in 
1887, the work "The International Language" in five languages (Russian, 
Polish, French, German, and English) by "D-ro Esperanto", a nom-de-plume of 
Dr. Louis Lazarus Zamenhof, there was an additional sentence under the 
"Alphabet" (I'm quoting from the English edition):


*Remark.* -- If it be found impracticable to print works with the
diacritical signs (^,˘), the letter h may be substituted for the sign
(^), and the sign (˘) may be omitted altogether.

IOW: ch, gh, hh, jh, sh, u.

One problem of this "official" or "Fundamental" notation is that it creates 
ambiguities between u and u-breve, and between letter+circumflex and letter+h. 
The latter can be found in e.g. chashundo (chas- +hundo, "a hunting dog"), 
danchalo (danc- + halo, "a ballroom"), etc. It is, however, most 
"natural-looking" in that the most-used of these, c-circumflex and 
s-circumflex, have exactly the sound of English ch and sh; and u-breve, which 
represents the semivowel [w], is used almost exclusively after a vowel, to 
form the second part of the closing diphthongs [au], [eu] and, rarely, [ou]. 
(U-breve also occurs initially in a few imported words like uato 
"(hydrophilic) cotton; cotton wool", from French "ouate".)


Examples: chasi "to hunt", ghardeno "a garden", ehho "an echo", bovajho 
"beef", shajni "to seem", preskau "almost".


* "Slavic" method (for use on Latin typewriters for East-European countries): 
replace the circumflex or breve by a caron (a superscript similar in shape to 
the letter v). This method, though "unofficial" like all the ones below, is 
also somewhat "natural-looking", at least for Slavic-language people.


* "pre-circumflex method": ^c, ^g, ^h, ^j, ^s, ^u

* "post-circumflex method": c^, g^, h^, j^, s^, u^

* "x-method": cx, gx, hx, jx, sx, ux. This one has been widely used on 
computer systems (see at bottom) but it requires _three_ (not two) case 
variants for each letter: uppercase (for titles in all-caps), CX GX HX JX SX 
UX; titlecase (for the first letter of a sentence or of a proper noun etc.), 
Cx Gx Hx Jx Sx Ux; lowercase, cx gx hx jx sx ux.


All the above except the first avoid ambiguities, because Esperanto doesn't 
use the letter X (or a freestanding circumflex); but they (especially the 
last three) have been variously described as "ugly" and as "contrary to the 
«Fundamento de Esperanto»".


Add to these, any of the above except the last, with u-breve replaced by 
u-grave (which, like the dead-key circumflex, can be found on any typewriter 
for the French language).
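The x-method described above is mechanical enough to sketch in a few lines. The following Python is only an illustration, not the code of any tool mentioned in this thread; the function names `x_to_unicode` and `unicode_to_x` are made up for the example, and the rule for choosing between "CX" and "Cx" (look at the neighbouring letters) is an assumption.

```python
# Illustrative sketch only: convert between the x-method and the accented letters.
# Function names and the all-caps heuristic are assumptions, not an existing tool.
X_BASES = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}
ACCENTED = {accented: base for base, accented in X_BASES.items()}


def x_to_unicode(text: str) -> str:
    """Replace each of c/g/h/j/s/u followed by x or X with its accented form."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch.lower() in X_BASES and nxt in ("x", "X"):
            accented = X_BASES[ch.lower()]
            # The case of the base letter decides the case of the result.
            out.append(accented.upper() if ch.isupper() else accented)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def unicode_to_x(text: str) -> str:
    """Write accented letters as digraphs: cx (lower), Cx (title), CX (upper)."""
    out = []
    for i, ch in enumerate(text):
        base = ACCENTED.get(ch.lower())
        if base is None:
            out.append(ch)
        elif ch.islower():
            out.append(base + "x")
        else:
            # Assumed heuristic: use "CX" when the adjacent letters are uppercase
            # (an all-caps word), otherwise the titlecase form "Cx".
            neighbours = [n for n in text[max(i - 1, 0):i] + text[i + 1:i + 2]
                          if n.isalpha()]
            all_caps = bool(neighbours) and all(n.isupper() for n in neighbours)
            out.append(base.upper() + ("X" if all_caps else "x"))
    return "".join(out)
```

For instance, `x_to_unicode("Cxirkaux")` gives "Ĉirkaŭ" and `unicode_to_x("ĈIRKAŬ")` gives "CXIRKAUX". Note that the sketch converts blindly, so a foreign word such as "Linux" would also be rewritten.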


IIUC, the eo_EO region would use the "actual" letters with superscripts (as 
found in Latin3 or Unicode), and the eo_UX region would use method 4, which is 
widely used on computer systems, including in e-mails emanating from the 
Universal Esperanto Association (i.e., the Esperantist headquarters) in Rotterdam.



Best regards,
Tony.

Re: Esperanto dictionary

2007-04-20 Thread Cyril Slobin

On 4/21/07, Bram Moolenaar <[EMAIL PROTECTED]> wrote:


OK.  So when the user does ":set spl=eo_eo" he still gets the "pure"
version?  It's important that the user has a choice of what words he
wants to accept.


Yes. Setting eo_eo accepts Unicode "Ĉirkaŭ" only, setting eo_ux accepts
ASCII "Cxirkaux" only, and setting eo accepts both.
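To make the three settings concrete, here is a toy model in Python. It is a sketch of the behaviour described above, not of how Vim implements regions: the word list, the `accepts` function and the way regions are stored are all hypothetical. (`spl` is the short name of Vim's 'spelllang' option.)

```python
# Toy model of the eo / eo_eo / eo_ux behaviour; NOT Vim's implementation.
# The word list and the accepts() helper are hypothetical.
WORDS = {
    "ĉirkaŭ": {"eo"},        # Unicode spelling, region EO only
    "cxirkaux": {"ux"},      # x-method spelling, region UX only
    "kaj": {"eo", "ux"},     # plain-ASCII word, valid in both regions
}


def accepts(word: str, spelllang: str) -> bool:
    """Return True if `word` is accepted under a setting like 'eo' or 'eo_ux'."""
    lang, _, region = spelllang.lower().partition("_")
    if lang != "eo":
        return False
    regions = WORDS.get(word.lower())
    if regions is None:
        return False
    # No region given ("eo"): accept a word from any region.
    return region == "" or region in regions
```

With this model, `accepts("Cxirkaux", "eo_eo")` is false, `accepts("Cxirkaux", "eo_ux")` is true, and both spellings are accepted under plain `"eo"`.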

And again -- I have not tested the Latin-3 version of the .spl file; I never use Latin-3.

--
Cyril Slobin <[EMAIL PROTECTED]> `When I use a word,' Humpty Dumpty said,
 `it means just what I choose it to mean'


Re: Esperanto dictionary

2007-04-20 Thread Bram Moolenaar

Cyril Slobin wrote:

> On 4/20/07, Bram Moolenaar <[EMAIL PROTECTED]> wrote:
> 
> > Please take the existing $VIMRUNTIME/spell/eo/main.aap and modify it a
> > bit to build the .spl file.  This can't be very difficult, you would
> > mostly use the command you type manually.
> 
> OK, I'll try this. Probably tomorrow.

Good.

> > What is strange is that myspell uses eo_l3 and you have eo_EO and eo_UX.
> > Why two regions?
> 
> Esperanto uses some letters from the Latin-3 character set. Of
> course, they are in Unicode too. But during half a century in an
> ASCII-based world, some conventions were established for
> transcribing these letters in pure ASCII. There is still some
> disagreement about which one is most popular, or most standard, or most
> suitable, but I believe that the "Cxirkaux" convention is the most widely used
> (no, I can't prove this with statistics). The convention is named
> "Cxirkaux" after the transcription of the word "Ĉirkaŭ" (I hope you have
> an appropriate font installed to read this). It is handy to be able to
> check Esperanto text in both modes (or to choose either one of the two).
> Making two files -- eo.ascii.spl and eo.utf-8.spl -- would probably be
> theoretically purer, but my solution allows switching between the two
> modes quickly.

OK.  So when the user does ":set spl=eo_eo" he still gets the "pure"
version?  It's important that the user has a choice of what words he
wants to accept.

-- 
I recommend ordering large cargo containers of paper towels to make up
whatever budget underruns you have.  Paper products are always useful and they
have the advantage of being completely flushable if you need to make room in
the storage area later.
(Scott Adams - The Dilbert principle)

 /// Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.Moolenaar.net   \\\
///sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\download, build and distribute -- http://www.A-A-P.org///
 \\\help me help AIDS victims -- http://ICCF-Holland.org///


Re: Esperanto dictionary

2007-04-20 Thread Cyril Slobin

On 4/20/07, Bram Moolenaar <[EMAIL PROTECTED]> wrote:


Please take the existing $VIMRUNTIME/spell/eo/main.aap and modify it a
bit to build the .spl file.  This can't be very difficult, you would
mostly use the command you type manually.


OK, I'll try this. Probably tomorrow.


What is strange is that myspell uses eo_l3 and you have eo_EO and eo_UX.
Why two regions?


Esperanto uses some letters from the Latin-3 character set. Of
course, they are in Unicode too. But during half a century in an
ASCII-based world, some conventions were established for
transcribing these letters in pure ASCII. There is still some
disagreement about which one is most popular, or most standard, or most
suitable, but I believe that the "Cxirkaux" convention is the most widely used
(no, I can't prove this with statistics). The convention is named
"Cxirkaux" after the transcription of the word "Ĉirkaŭ" (I hope you have
an appropriate font installed to read this). It is handy to be able to
check Esperanto text in both modes (or to choose either one of the two).
Making two files -- eo.ascii.spl and eo.utf-8.spl -- would probably be
theoretically purer, but my solution allows switching between the two
modes quickly.

--
Cyril Slobin <[EMAIL PROTECTED]> `When I use a word,' Humpty Dumpty said,
 `it means just what I choose it to mean'


Re: Esperanto dictionary

2007-04-19 Thread Bram Moolenaar

Cyril Slobin wrote:

> Who maintains the Esperanto spell files for Vim? The file eo.utf-8.spl is
> completely broken! In fact it was already broken long ago when I downloaded
> Vim 7.0. Now I've upgraded to 7.0.219 and checked whether anything
> has improved. No hope -- it is the same broken file. I use the Win32
> version of Vim.

There is no maintainer.  I simply took the spell files from myspell
(OpenOffice.org).  They are still dated 27-Oct-2005, thus it appears
nobody is working on them.

> I have compiled my own eo.utf-8.spl from the ispell sources by Sergio
> Pokrovskij found in the Debian 3.1 distribution. It understands both real
> Unicode and the surrogate "Cxirkaux" style (if you don't speak Esperanto,
> you don't need to understand this). The archive contains the .spl file itself,
> two .dic files, two .aff files and a short readme file (it is in
> Esperanto, not English, and named "legumin", not "readme"). You can
> download it from:
> 
> http://www.45.free.net/~slobin/vim/eo.utf-8.zip
> 
> Maybe it is a good idea to replace the broken file with mine on the Vim ftp site.
> 
> I've never used aap and don't know the Vim maintenance process; I
> just manually converted the ispell files to myspell ones and then compiled
> them to Vim format.
> 
> I have not checked the eo.iso-8859-3.spl file; I never use iso-8859-3.

So that others can reproduce building the .spl file, a script must be
used to fetch the input files, do any conversions/patching and use Vim
to build the .spl file.

Please take the existing $VIMRUNTIME/spell/eo/main.aap and modify it a
bit to build the .spl file.  This can't be very difficult, you would
mostly use the command you type manually.

What is strange is that myspell uses eo_l3 and you have eo_EO and eo_UX.
Why two regions?

-- 
I once paid $12 to peer at the box that held King Tutankhamen's little
bandage-covered midget corpse at the De Young Museum in San Francisco.  I
remember thinking how pleased he'd be about the way things turned out in his
afterlife.
(Scott Adams - The Dilbert principle)

 /// Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.Moolenaar.net   \\\
///sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\download, build and distribute -- http://www.A-A-P.org///
 \\\help me help AIDS victims -- http://ICCF-Holland.org///


Re: Esperanto dictionary

2007-04-04 Thread Hugh Sasse
On Wed, 4 Apr 2007, Cyril Slobin wrote:

> It seems this message didn't reach the list. Reposting. I'm sorry
> if it appears twice.
> 
> On 4/2/07, A.J.Mechelynck <[EMAIL PROTECTED]> wrote:
> 
> > Well, I suppose both uppercase and titlecase should be supported then. Cxu ne?
> > CXU VERE NE? (Kompreneble, ĉiukaze mi preferas "verajn" ĉapelitajn literojn.)
> 
> CXU, Cxu and cxu all pass checking; CXu doesn't. And I believe
> this is the Right Thing.

I'll concede this: I'm hardly an expert! :-)
> 
> > I suppose texts written in "«Fundamenta» h-stilo" could emphasise the radical
> > break when needed, as in flug-haveno, chas-hundo, danc-halo, ktp. (er, etc.).
> 
> Just checked -- the translation table used by my plugin knows about
> flughaveno and chashundo, but not about danchalo. I didn't write this
> table myself, but borrowed it from UniRed (another open-source editor).
> Anyway, you can easily add danchalo and any other such word to the table
> yourself -- it is in a simple text format.
> 
> BTW, the x-style is not free from such problems. Pure Esperanto text is OK,
> but consider what happens when you use the word "Linux" in it!

:-)  Linukso (and likewise Vindozo, Unikso...), I think

http://fagot.alain.free.fr/KompLeks/UTF8/INDL.html
Hugh

Re: Esperanto dictionary

2007-04-03 Thread Cyril Slobin

On 4/2/07, A.J.Mechelynck <[EMAIL PROTECTED]> wrote:


Well, I suppose both uppercase and titlecase should be supported then. Cxu ne?
CXU VERE NE? (Kompreneble, ĉiukaze mi preferas "verajn" ĉapelitajn literojn.)


CXU, Cxu and cxu all pass checking; CXu doesn't. And I believe
this is the Right Thing.
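One way to see why exactly these three forms pass: if a checker stores the word in lowercase and accepts it as-is, with only the first character capitalised, or in all capitals, then "CXu" matches none of those forms. The Python below is a sketch under that assumption; it is not taken from Vim's source, and `accepted_forms` and `passes` are names invented for the example.

```python
# Sketch under an assumption: a lowercase dictionary entry is accepted in
# lowercase, with the first character capitalised, or in all capitals.
# accepted_forms() and passes() are hypothetical helpers, not Vim functions.
def accepted_forms(entry: str) -> set[str]:
    """Return the case variants accepted for one lowercase dictionary entry."""
    return {entry, entry.capitalize(), entry.upper()}


def passes(word: str, dictionary: set[str]) -> bool:
    """True if `word` is an accepted case variant of some dictionary entry."""
    return any(word in accepted_forms(entry) for entry in dictionary)


dictionary = {"cxu"}
for candidate in ("CXU", "Cxu", "cxu", "CXu"):
    print(candidate, passes(candidate, dictionary))
```

Under this rule the titlecase digraph "Cx" falls out of ordinary first-letter capitalisation, with no special handling of the x needed.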


I suppose texts written in "«Fundamenta» h-stilo" could emphasise the radical
break when needed, as in flug-haveno, chas-hundo, danc-halo, ktp. (er, etc.).


Just checked -- the translation table used by my plugin knows about
flughaveno and chashundo, but not about danchalo. I didn't write this
table myself, but borrowed it from UniRed (another open-source editor).
Anyway, you can easily add danchalo and any other such word to the table
yourself -- it is in a simple text format.

BTW, the x-style is not free from such problems. Pure Esperanto text is OK,
but consider what happens when you use the word "Linux" in it!

--
Cyril Slobin <[EMAIL PROTECTED]> `When I use a word,' Humpty Dumpty said,
 `it means just what I choose it to mean'


Re: Esperanto dictionary

2007-04-03 Thread Hugh Sasse
On Tue, 3 Apr 2007, A.J.Mechelynck wrote:

> 
> Some years ago, I wrote the chapter of the Vim FAQ about Unicode: browse to
> http://vimdoc.sourceforge.net/htmldoc/vimfaq.html and scroll to the last
> section, e.g. by searching the page for the string SECTION 37 (which happens
> twice, once in the table of contents and once at the head of the section
> itself).

Yes, there's good stuff there.  I'm not entirely sure how all those things
will interact but having them all together gives me scope for experimentation.
Thank you.

Hugh


Re: Esperanto dictionary

2007-04-03 Thread A.J.Mechelynck

Hugh Sasse wrote:
[...]
My problem is that I mainly work through Windows systems (often ssh into 
Solaris, but still) and I don't have a clue what to do with fonts for all
this, e.g. in PuTTY.  I'm not entirely clear how to do this in gvim for that
matter.  I've read some of the help on UTF-8 but I'm still rather confused,
being very much at the Beginner stage for this in terms of the Dreyfus
model of skill acquisition
http://www.pragmaticprogrammer.com/articles/cook_until_done.html
so if someone has a really gentle introduction to all this I'd be grateful.
I've noticed that Word stores things in UTF-16 (LOTS of nulls :-)) so
this should be achievable, but



Some years ago, I wrote the chapter of the Vim FAQ about Unicode: browse to 
http://vimdoc.sourceforge.net/htmldoc/vimfaq.html and scroll to the last 
section, e.g. by searching the page for the string SECTION 37 (which happens 
twice, once in the table of contents and once at the head of the section itself).


I just reread that section, which is a series of short explanations of the 
things most important for using Unicode in Vim, with links to the appropriate 
help topics and in a few cases to documentation elsewhere on the Web. Apart 
from a few typos, it can still be regarded as accurate. (I didn't check the 
external links though; if you find one that is broken, report it to Yegappan, 
the maintainer of that FAQ.)


Best regards,
Tony.
--
I'm changing my name to Chrysler
I'm going down to Washington, D.C.
I'll tell some power broker
What they did for Iacocca
Will be perfectly acceptable to me!
I'm changing my name to Chrysler,
I'm heading for that great receiving line.
When they hand a million grand out,
I'll be standing with my hand out,
Yessir, I'll get mine!
-- Tom Paxton


Re: Esperanto dictionary

2007-04-03 Thread Hugh Sasse
On Mon, 2 Apr 2007, A.J.Mechelynck wrote:

> Cyril Slobin wrote:
> > On 4/2/07, Hugh Sasse <[EMAIL PROTECTED]> wrote:

[Info about plugin trimmed. Thank you.]

> > > Also isn't your example often written "CXirkaux" because the CX is
> > > (effectively) one character, capitalized?
> > 
> > I've never seen this form, and I believe it is ugly. And in Unicode
> > terms, this one character is not capitalized, but title-cased.

I've seen it used on the web, but it's not easy to search for :-).
> > 
> 
> Well, I suppose both uppercase and titlecase should be supported then. Cxu ne?

I've not encountered "titlecase" before this thread, so I don't
understand its semantics yet.

> CXU VERE NE? (Kompreneble, ĉiukaze mi preferas "verajn" ĉapelitajn
> literojn.)
> 
> I suppose texts written in "«Fundamenta» h-stilo" could emphasise the
> radical break when needed, as in flug-haveno, chas-hundo, danc-halo, ktp. (er,
> etc.). Anyway, I anticipate that all substitution schemes will become less and
> less necessary as Unicode generalizes: e.g., my fr_BE keyboard supports
> consonants with circumflex "out of the box" in openSUSE Linux 10.2 (thus going
> back to the "universality" of the French typewriters of Zamenhof's time ;-) ).

My problem is that I mainly work through Windows systems (often ssh into 
Solaris, but still) and I don't have a clue what to do with fonts for all
this, e.g. in PuTTY.  I'm not entirely clear how to do this in gvim for that
matter.  I've read some of the help on UTF-8 but I'm still rather confused,
being very much at the Beginner stage for this in terms of the Dreyfus
model of skill acquisition
http://www.pragmaticprogrammer.com/articles/cook_until_done.html
so if someone has a really gentle introduction to all this I'd be grateful.
I've noticed that Word stores things in UTF-16 (LOTS of nulls :-)) so
this should be achievable, but

> 
> Best regards,
> Tony.

Thank you,
Hugh

Re: Esperanto dictionary

2007-04-02 Thread A.J.Mechelynck

Cyril Slobin wrote:

On 4/2/07, Hugh Sasse <[EMAIL PROTECTED]> wrote:


It might be useful to also support C^irkau^ as well.  I'm not sure
how often the h form is used given the exception(s?) (flughaveno...)


For h form you can use my plugin:

   http://www.vim.org/scripts/script.php?script_id=1761

It converts various ASCII representations to Unicode and vice versa.
Among others, it supports the Cxirkaux style, Zamenhof's style with h (and
it knows about flughaveno and chashundo!), HTML/XML entities,
TeX/LaTeX notation and many more... If you want to spell check text
written with h's, you just convert it to Unicode, check, and convert
back. The plugin is table-driven, and I didn't write the tables myself -- I
borrowed them from two other open-source projects (UniRed and catdoc).
UniRed also has tables for ^Cirka^u, C^irkau^ and C`irkau`, and the plugin
can use them, but I haven't bundled them with the plugin.


Also isn't your example often written "CXirkaux" because the CX is
(effectively) one character, capitalized?


I've never seen this form, and I believe it is ugly. And in Unicode
terms, this one character is not capitalized, but title-cased.



Well, I suppose both uppercase and titlecase should be supported then. Cxu ne? 
CXU VERE NE? (Kompreneble, ĉiukaze mi preferas "verajn" ĉapelitajn literojn.)


I suppose texts written in "«Fundamenta» h-stilo" could emphasise the radical 
break when needed, as in flug-haveno, chas-hundo, danc-halo, ktp. (er, etc.). 
Anyway, I anticipate that all substitution schemes will become less and less 
necessary as Unicode generalizes: e.g., my fr_BE keyboard supports consonants 
with circumflex "out of the box" in openSUSE Linux 10.2 (thus going back to 
the "universality" of the French typewriters of Zamenhof's time ;-) ).


Best regards,
Tony.
--
How can you be in two places at once when you're not anywhere at all?


Re: Esperanto dictionary

2007-04-02 Thread Cyril Slobin

On 4/2/07, Hugh Sasse <[EMAIL PROTECTED]> wrote:


It might be useful to also support C^irkau^ as well.  I'm not sure
how often the h form is used given the exception(s?) (flughaveno...)


For h form you can use my plugin:

   http://www.vim.org/scripts/script.php?script_id=1761

It converts various ASCII representations to Unicode and vice versa.
Among others, it supports the Cxirkaux style, Zamenhof's style with h (and
it knows about flughaveno and chashundo!), HTML/XML entities,
TeX/LaTeX notation and many more... If you want to spell check text
written with h's, you just convert it to Unicode, check, and convert
back. The plugin is table-driven, and I didn't write the tables myself -- I
borrowed them from two other open-source projects (UniRed and catdoc).
UniRed also has tables for ^Cirka^u, C^irkau^ and C`irkau`, and the plugin
can use them, but I haven't bundled them with the plugin.
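The table-driven idea for the h form can be pictured with a short Python sketch. This is not the plugin's code and does not use the UniRed or catdoc table format; the `EXCEPTIONS` table and the `h_to_unicode` function are hypothetical, the sketch handles lowercase words only, and it leaves u untouched because the h form drops the breve and restoring it needs more than a letter rule.

```python
# Illustrative sketch of a table-driven h-form converter; NOT the plugin's code.
# EXCEPTIONS and h_to_unicode() are hypothetical; lowercase input is assumed.
H_BASES = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ"}

# Compound words in which "letter + h" spans a word-part boundary, so the pair
# must not be merged. Each entry maps the h-form spelling to the correct result.
EXCEPTIONS = {
    "flughaveno": "flughaveno",  # flug + haveno: "gh" stays as it is
    "chashundo": "ĉashundo",     # ĉas + hundo: "ch" converts, "sh" does not
}


def h_to_unicode(word: str) -> str:
    """Convert one lowercase h-form word, consulting the exception table first."""
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in H_BASES and word[i + 1:i + 2] == "h":
            out.append(H_BASES[ch])  # merge "ch" -> "ĉ", "hh" -> "ĥ", etc.
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With this table, "chasi" becomes "ĉasi" and "chashundo" becomes "ĉashundo", while a compound missing from the table, such as "danchalo", is wrongly merged until someone adds it.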


Also isn't your example often written "CXirkaux" because the CX is
(effectively) one character, capitalized?


I've never seen this form, and I believe it is ugly. And in Unicode
terms, this one character is not capitalized, but title-cased.

--
Cyril Slobin <[EMAIL PROTECTED]> `When I use a word,' Humpty Dumpty said,
 `it means just what I choose it to mean'


Re: Esperanto dictionary

2007-04-02 Thread Hugh Sasse
On Sat, 31 Mar 2007, Cyril Slobin wrote:

> Hi all!
[...] 
> I have compiled my own eo.utf-8.spl from the ispell sources by Sergio
> Pokrovskij found in the Debian 3.1 distribution. It understands both real
> Unicode and the surrogate "Cxirkaux" style (if you don't speak Esperanto,

It might be useful to also support C^irkau^ as well.  I'm not sure
how often the h form is used given the exception(s?) (flughaveno...)
Also isn't your example often written "CXirkaux" because the CX is 
(effectively) one character, capitalized?

Anyway, nice to see someone working on this stuff.  

Hugh



Re: Esperanto dictionary

2007-03-30 Thread A.J.Mechelynck

Cyril Slobin wrote:

Hi all!

Who maintains the Esperanto spell files for Vim? The file eo.utf-8.spl is
completely broken! In fact it was already broken long ago when I downloaded
Vim 7.0. Now I've upgraded to 7.0.219 and checked whether anything
has improved. No hope -- it is the same broken file. I use the Win32
version of Vim.

I have compiled my own eo.utf-8.spl from the ispell sources by Sergio
Pokrovskij found in the Debian 3.1 distribution. It understands both real
Unicode and the surrogate "Cxirkaux" style (if you don't speak Esperanto,
you don't need to understand this). The archive contains the .spl file itself,
two .dic files, two .aff files and a short readme file (it is in
Esperanto, not English, and named "legumin", not "readme"). You can
download it from:

   http://www.45.free.net/~slobin/vim/eo.utf-8.zip

Maybe it is a good idea to replace the broken file with mine on the Vim ftp site.

I've never used aap and don't know the Vim maintenance process; I
just manually converted the ispell files to myspell ones and then compiled
them to Vim format.

I have not checked the eo.iso-8859-3.spl file; I never use iso-8859-3.



That file is no less broken. I'm sending you the details in a private email in 
Esperanto.


Best regards,
Tony.
--
Oregano, n.:
The ancient Italian art of pizza folding.