Hi,

I'm forwarding the email below to the list.

Seems to me that there is an opportunity here to get these dictionary
files up to the LibO extension site...shall we take it?

//drew

-------- Forwarded Message --------
From: Tim Lungstrom <timo...@lungstrom.com>
To: Florian Effenberger <flo...@documentfoundation.org>
Cc: drew <d...@baseanswers.com>, m...@marcpare.com, Tom Davies
<tomdavie...@yahoo.co.uk>, carlsym...@gmail.com, italo.vign...@gmail.com
Subject: Re: en_US dictionary with April 19 2011 .dic file, plus French
updates possible as well
Date: Thu, 30 Jun 2011 10:49:34 -0400

On 06/30/2011 06:13 AM, Florian Effenberger wrote:
> Hello,
>
> can someone follow-up on this, or did someone already do? It's a bit out
> of my scope, I guess :-)
>
> Florian
>
Attached are the most up-to-date French dictionaries, so far, plus the 
en_US one listed in the subject line.

Three of the French dictionaries have not been uploaded, because that 
slipped through the cracks in my memory after I downloaded the newest 
ones.  I was going to manually update the French ones with the newest 
.dic word list files, since OpenOffice.org's site was offline more often 
than it was running.  Then a few weeks later, that site was up long 
enough for me to grab as many of the updated dictionaries as I could.

If needed, I can take newer .dic word lists that are available and 
replace the older ones to try to keep the dictionaries up-to-date.  The 
en_US .dic file was never updated with the newest one[s] that I found, 
so I created the attached one.

The newest en_US .dic file that was on the OOo site was dated 
2010-03-15, while the "default" installed one was dated 2007-05-04, as 
far as I can see.  The newest .dic file I found was dated 2011-04-19, so 
that was at least one year newer than the newest one on the OOo site 
and almost four years newer than the "default" installed one [as far as 
I can tell about the default installed file].

The big thing about word list .dic files is that each one is just a 
list of properly spelled words.  Many of the words in the file have 
reference codes that seem to be used to reference spelling definitions, 
such as suffix replacements.  I worked on a dictionary project many 
years ago and that is the way it seemed to be done, since it takes much 
less file space to do it that way.  I had to write a test program for that 
list and codes, since that was not available to the people working on 
the list creation.  It seemed that the Asian group creating the spell 
checker for the "electronic" typewriter did not want to share the code 
for that so that the team I worked with could make sure the work was 
done correctly.  I was the support person for that team at the college 
where the work was being done.  This was when PCs were pre-Windows machines.

dictatorship/MS
diction/SVM
dictionary/MS
dictum/M
did/4612
didact/MS
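
As an illustration of the entries above, here is a minimal Python sketch 
that splits each line into the word and its reference codes.  It assumes 
the layout shown in the sample (word, optional "/", then one character 
per code).  The function name parse_dic_line is hypothetical, and the 
sketch does not try to interpret what each code means; in Hunspell-style 
dictionaries that is defined in a companion .aff file.

```python
def parse_dic_line(line):
    """Split one .dic entry into (word, list of code characters).

    Assumption: everything after the first "/" is a run of
    single-character codes, as in the sample lines above.
    """
    entry = line.strip()
    word, _, codes = entry.partition("/")
    return word, list(codes)


sample = """dictatorship/MS
diction/SVM
dictionary/MS
dictum/M
didact/MS"""

# Map each word to its codes, e.g. "diction" -> ["S", "V", "M"].
entries = dict(parse_dic_line(line) for line in sample.splitlines())

for word, codes in entries.items():
    print(word, codes)
```

A word with no "/" simply comes back with an empty code list.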

The April 2011 .dic file has about 51,000 words in it.
I have a word list, not in the .dic format, with over 213,000 words in 
it.  That list is missing some words.  "dictionary" was not in that 
213,000 word list for some reason.
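
One way to spot gaps like the missing "dictionary" is to compare the two 
lists directly.  The following is a rough Python sketch, not a finished 
tool: the function names and the tiny in-memory lists are made up for 
illustration, and it assumes a .dic file may start with a numeric count 
line, as Hunspell-style files do.

```python
def dic_words(dic_lines):
    """Collect the bare words from .dic-style lines, dropping any codes."""
    words = set()
    for line in dic_lines:
        entry = line.strip()
        # Skip blank lines and a leading count line (assumed to be all digits).
        if not entry or entry.isdigit():
            continue
        words.add(entry.split("/", 1)[0])
    return words


def missing_from(plain_words, dic_lines):
    """Return the .dic words that do not appear in the plain word list."""
    return sorted(dic_words(dic_lines) - set(plain_words))


# Hypothetical stand-ins for the real files.
dic_lines = ["5", "dictatorship/MS", "diction/SVM", "dictionary/MS",
             "dictum/M", "didact/MS"]
plain_words = ["dictatorship", "diction", "dictum", "didact"]

print(missing_from(plain_words, dic_lines))  # prints ['dictionary']
```

With real files, the two lists would be read line by line from disk 
instead of being typed in.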
-------------------


Also attached to this email are two other new dictionary files that I do 
not remember seeing on the OOo web site.

         Brazilian--dictionary-thesaurus--Dicionário de Sinônimos 
Protuguês Brasil____2010-05-28.oxt
         Scottish Gaelic Dictionary--gd_GB____2009-02-10.oxt

Neither of these has been added to the site.  Here are the 
Brazilian Portuguese and Scottish Gaelic dictionaries currently listed there.

         Portuguese (Brazilian) - Vero - Brazilian Portuguese 
Spellchecking Dictionary & Hyphenator
                                         - Brazilian Portuguese grammar 
checker 3.1.0 2010-11-17

         Scots Gaelic     Gàidhlig     An Dearbhair-litreachaidh Beag - 
Spelling dictionary for Scottish Gaelic

-------------------------------------------------------

It seems that the creators of the .dic files are often not the same 
people who create the .oxt dictionaries for "us" to use.  Many of these 
same word lists are also used by programs like Thunderbird, but 
packaged in a different kind of archive, with other files besides the .dic files.




-- 
Unsubscribe instructions: E-mail to projects+h...@global.libreoffice.org
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/projects/
All messages sent to this list will be publicly archived and cannot be deleted
