Re: [libreoffice-l10n] New Language Dictionary/Spelling/AutoCorrect

2015-05-31 Thread Kevin Scannell
Hi Gio,

 The data files from crubadan.org are just raw frequency lists from
Guaraní pages crawled from the web.  They wouldn't be suitable for a
spell checker without some cleaning/editing.  Also, since Guaraní has
complicated word structure, an affix file would be important too.
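
A rough sketch of the kind of cleaning this would involve, not the actual
Crúbadán tooling. It assumes each line of the frequency list is a word
followed by a count, separated by whitespace. The threshold `min_count`, the
set of allowed word-internal marks, and all function names are hypothetical
choices for illustration. The output is the plain word-list half of a
Hunspell dictionary (a word count on the first line, then one word per line).
It does nothing about the affix file.

```python
import unicodedata

# Assumed word-internal marks (apostrophes for the glottal stop, hyphen).
ALLOWED_PUNCT = {"'", "\u2019", "-"}


def is_wordlike(token):
    """True if token starts with a letter and contains only letters,
    combining marks, and the allowed internal punctuation."""
    if not token or not token[0].isalpha() or token[-1] in ALLOWED_PUNCT:
        return False
    return all(
        ch.isalpha() or ch in ALLOWED_PUNCT or unicodedata.category(ch) == "Mn"
        for ch in token
    )


def clean_frequency_list(lines, min_count=2):
    """Keep word-like tokens seen at least min_count times, NFC-normalized."""
    kept = set()
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip lines that are not "word count"
        word = unicodedata.normalize("NFC", parts[0])
        if int(parts[1]) >= min_count and is_wordlike(word):
            kept.add(word)
    return sorted(kept)


def to_dic(words):
    """Render a word list as .dic text: entry count first, then the words."""
    return "\n".join([str(len(words)), *words]) + "\n"
```

A frequency threshold only removes rare noise. Frequent misspellings and
foreign words would still need manual review.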

  I worked a bit on a Guaraní spell checker with some people in
Paraguay almost 10 years ago.  I'll write you off-list and point you
to some resources.

Kevin




2015-05-29 18:11 GMT-05:00 Michael Bauer :
> Ah if it's crubadan then you should contact Kevin (http://borel.slu.edu/) -
> he will be able to tell you more.
>
> Also copying him in. Kevin, Giovanni is working on Guaraní and wants to
> include spellcheckers but is unsure as to what there is and what needs done.
>
> Michael
>
> Giovanni Caligaris wrote the following on 29/05/2015 at 23:40:
>
> I have found these two links. I downloaded the zips and opened them, and
> they look like a spell checker, but I am not quite sure what they are.
>
> http://crubadan.org/ws/gug.html
> and
> http://crubadan.org/ws/gn.html
>
>
> --
> Akerbeltz
> Gaelic resources on the web
> Phone: +44-141-946 4437
> Fax: +44-141-945 2701
>
> Your computer speaks Gaelic, go on, try it!
> Many things, from office programs and browsers to predictive texting,
> games and much more. Visit us at www.iGàidhlig.net

-- 
To unsubscribe e-mail to: l10n+unsubscr...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/l10n/
All messages sent to this list will be publicly archived and cannot be deleted


Re: [libreoffice-l10n] New Language Dictionary/Spelling/AutoCorrect

2015-05-29 Thread Michael Bauer
Ah if it's crubadan then you should contact Kevin 
(http://borel.slu.edu/) - he will be able to tell you more.


Also copying him in. Kevin, Giovanni is working on Guaraní and wants to 
include spellcheckers but is unsure as to what there is and what needs done.


Michael

Giovanni Caligaris wrote the following on 29/05/2015 at 23:40:
I have found these two links. I downloaded the zips and opened them, and
they look like a spell checker, but I am not quite sure what they are.


http://crubadan.org/ws/gug.html
and
http://crubadan.org/ws/gn.html 


--
*Akerbeltz*
Gaelic resources on the web
Phone: +44-141-946 4437
Fax: +44-141-945 2701

*Your computer speaks Gaelic, go on, try it!*
Many things, from office programs and browsers to predictive texting,
games and much more. Visit us at www.iGàidhlig.net





Re: [libreoffice-l10n] New Language Dictionary/Spelling/AutoCorrect

2015-05-29 Thread Giovanni Caligaris

Hello Rimas

Thanks for answering


> It seems there is (or at least
> has been) a project on SourceForge to create such dictionaries for
> Guarani: http://sourceforge.net/projects/guarani/. You may try
> contacting the people behind that project to see if they have produced
> anything of use. Even if not, perhaps they would be interested in
> restarting.

I checked the link, and it seems the project never really got started.
I have found these two links. I downloaded the zips and opened them, and
they look like a spell checker, but I am not quite sure what they are.


http://crubadan.org/ws/gug.html
and
http://crubadan.org/ws/gn.html


> Autocorrect is a third type of dictionary, which is also different than
> the other two. I think creating this dictionary might be the easiest
> task of the three: I have written a basic online tool in PHP which could
> help you crowdsource suggestions for autocorrect. At least that's what
> it does for me.

How can I create this?

-Gio


On 29/05/15 17:18, Rimas Kudelis wrote:

Hi Giovanni,


2015.05.29 19:24, Giovanni Caligaris wrote:

I would like to have a Guarani dictionary/spelling/autocorrect in the
future. I was wondering how to do it. Is it different for every OS
(linux/win/mac)? Is it a Pootle project?

No, these are different projects, which have nothing to do with Pootle.

LibreOffice, just like a bunch of software, uses Hunspell and/or MySpell
(http://hunspell.sourceforge.net/) dictionaries for spell checking. This
is regardless of the operating system in use. Making a spellcheck
dictionary is quite a big task by itself. It seems there is (or at least
has been) a project on SourceForge to create such dictionaries for
Guarani: http://sourceforge.net/projects/guarani/. You may try
contacting the people behind that project to see if they have produced
anything of use. Even if not, perhaps they would be interested in
restarting.

If there is a spellchecker dictionary available for Guarani, with an
acceptable license, it might be possible to convert its files into
MySpell/HunSpell and use them.


Hyphenation dictionaries are another thing: they use a different format
and different data than the spellchecker dictionaries. I know for a fact
that TeX hyphenation dictionaries can be converted to a format
suitable for LibO, but perhaps that's not the only possible conversion. If
there is nothing to convert, such a dictionary would have to be built from
scratch.

By the way, hyphenation dictionaries can also be used in other projects
(e.g. OpenOffice, Mozilla)


Autocorrect is a third type of dictionary, which is also different than
the other two. I think creating this dictionary might be the easiest
task of the three: I have written a basic online tool in PHP which could
help you crowdsource suggestions for autocorrect. At least that's what
it does for me.

Hope this helps. Sorry for not posting any links. I hope someone more
familiar with our wiki can post pointers to further relevant information.

Regards,
Rimas







Re: [libreoffice-l10n] New Language Dictionary/Spelling/AutoCorrect

2015-05-29 Thread Rimas Kudelis
Hi Giovanni,


2015.05.29 19:24, Giovanni Caligaris wrote:
> I would like to have a Guarani dictionary/spelling/autocorrect in the
> future. I was wondering how to do it. Is it different for every OS
> (linux/win/mac)? Is it a Pootle project?

No, these are different projects, which have nothing to do with Pootle.

LibreOffice, just like a bunch of software, uses Hunspell and/or MySpell
(http://hunspell.sourceforge.net/) dictionaries for spell checking. This
is regardless of the operating system in use. Making a spellcheck
dictionary is quite a big task by itself. It seems there is (or at least
has been) a project on SourceForge to create such dictionaries for
Guarani: http://sourceforge.net/projects/guarani/. You may try
contacting the people behind that project to see if they have produced
anything of use. Even if not, perhaps they would be interested in
restarting.

If there is a spellchecker dictionary available for Guarani, with an
acceptable license, it might be possible to convert its files into
MySpell/HunSpell and use them.
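
For orientation, a Hunspell dictionary is a pair of plain-text files: a
.dic word list (entry count on the first line, then one word per line,
optionally followed by "/" and affix flags) and an .aff file holding the
encoding and the affix rules. The sketch below only illustrates that
layout. The helper names are made up, and the example suffixes in the usage
note are placeholders, not real Guarani morphology.

```python
def build_aff(suffix_rules, encoding="UTF-8"):
    """Render a minimal .aff file.

    suffix_rules maps a one-letter flag to a list of suffix strings.
    Each rule strips nothing ("0") and applies to any stem (".").
    """
    lines = [f"SET {encoding}"]
    for flag, suffixes in suffix_rules.items():
        # Header: SFX <flag> <cross-product Y/N> <number of rules>
        lines.append(f"SFX {flag} Y {len(suffixes)}")
        for suffix in suffixes:
            # Rule: SFX <flag> <strip> <suffix to add> <condition>
            lines.append(f"SFX {flag} 0 {suffix} .")
    return "\n".join(lines) + "\n"


def build_dic(entries):
    """Render a .dic file from a dict mapping word -> flags ('' for none)."""
    lines = [str(len(entries))]
    for word in sorted(entries):
        flags = entries[word]
        lines.append(f"{word}/{flags}" if flags else word)
    return "\n".join(lines) + "\n"
```

With `build_aff({"A": ["s", "es"]})` and `build_dic({"b": "A", "a": ""})`,
the entry `b/A` accepts `b`, `bs` and `bes`, while `a` is accepted only as
written. For a language with rich word structure, most of the work goes into
the rules and conditions in the .aff file.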


Hyphenation dictionaries are another thing: they use a different format
and different data than the spellchecker dictionaries. I know for a fact
that TeX hyphenation dictionaries can be converted to a format
suitable for LibO, but perhaps that's not the only possible conversion. If
there is nothing to convert, such a dictionary would have to be built from
scratch.
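
To give an idea of what the data in such a dictionary looks like: TeX
hyphenation files are lists of Liang-style patterns, in which digits
between letters score possible break points (odd allows a break, even
forbids one, and the highest score wins). The sketch below applies such
patterns to a word. It assumes the LibO format keeps the same pattern idea,
which should be checked against an existing hyphenation dictionary. The
function names and the toy patterns in the usage note are invented for
illustration and are not real Guarani data.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'a1b' into its letters and break scores.

    Returns (letters, points), where points[i] is the score of the gap
    before letters[i] and points[-1] the score after the last letter.
    """
    letters = "".join(ch for ch in pattern if not ch.isdigit())
    points = [0]
    for ch in pattern:
        if ch.isdigit():
            points[-1] = int(ch)
        else:
            points.append(0)
    return letters, points


def hyphenate(word, patterns, left_min=2, right_min=2):
    """Split word at gaps whose highest matching pattern score is odd."""
    padded = "." + word.lower() + "."  # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            points = patterns.get(padded[start:end])
            if points:
                for offset, value in enumerate(points):
                    pos = start + offset
                    scores[pos] = max(scores[pos], value)
    pieces, last = [], 0
    for i in range(left_min, len(word) - right_min + 1):
        # scores[i + 1] is the gap just before word[i]
        if scores[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return pieces
```

With the toy patterns `dict(parse_pattern(p) for p in ["a1b", "a1n"])`,
`hyphenate("cabana", patterns)` gives `["ca", "ba", "na"]`.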

By the way, hyphenation dictionaries can also be used in other projects
(e.g. OpenOffice, Mozilla)


Autocorrect is a third type of dictionary, which is also different than
the other two. I think creating this dictionary might be the easiest
task of the three: I have written a basic online tool in PHP which could
help you crowdsource suggestions for autocorrect. At least that's what
it does for me.
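
As an illustration of what crowdsourced suggestions could be turned into
(this is not the PHP tool mentioned above): the sketch below picks, for each
typed form, the correction most contributors agreed on, and renders the
pairs as a replacement table. It assumes the layout used by the
DocumentList.xml file inside LibreOffice's acor_*.dat autocorrect archives,
which should be verified against an existing file. The function names and
the `min_votes` threshold are hypothetical, and packaging the XML into the
archive is not shown.

```python
from collections import Counter
from xml.sax.saxutils import quoteattr

# Namespace assumed from existing autocorrect files; verify before use.
BLOCK_LIST_NS = "http://openoffice.org/2001/block-list"


def pick_replacements(suggestions, min_votes=2):
    """Choose the most-voted correction per typed form.

    suggestions is an iterable of (typed, corrected) pairs, one per vote.
    Typed forms whose top correction has fewer than min_votes are dropped.
    """
    votes = {}
    for typed, corrected in suggestions:
        votes.setdefault(typed, Counter())[corrected] += 1
    chosen = {}
    for typed, counter in votes.items():
        corrected, count = counter.most_common(1)[0]
        if count >= min_votes:
            chosen[typed] = corrected
    return chosen


def to_document_list(replacements):
    """Render replacement pairs as block-list XML text."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<block-list:block-list xmlns:block-list="{BLOCK_LIST_NS}">',
    ]
    for typed in sorted(replacements):
        lines.append(
            "  <block-list:block "
            f"block-list:abbreviated-name={quoteattr(typed)} "
            f"block-list:name={quoteattr(replacements[typed])}/>"
        )
    lines.append("</block-list:block-list>")
    return "\n".join(lines) + "\n"
```

Requiring agreement between at least two contributors is one simple way to
keep a single mistaken suggestion out of the table.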

Hope this helps. Sorry for not posting any links. I hope someone more
familiar with our wiki can post pointers to further relevant information.

Regards,
Rimas

