On Wed, Jul 14, 2010 at 10:06 AM, Caolán McNamara <[email protected]> wrote:
> On Wed, 2010-07-14 at 10:00 -0500, Peng Yu wrote:
>> I realize that it may be better to send the questions to the dev
>> mailing list. So do I. Thank you!
>>
>>
>> ---------- Forwarded message ----------
>> From: Peng Yu <[email protected]>
>> Date: Wed, Jul 14, 2010 at 9:54 AM
>> Subject: Is there a dictionary in the format of text file?
>> To: [email protected]
>>
>>
>> Hi,
>>
>> I'd like to have a local copy of all the English words (including the
>> variants, like, plural form, -ing form, -s form). Since there is spell
>> check in OO, there might be a dictionary file available somewhere.
>> Would you please let me know where it is?
>
> In theory, though there are bugs in it unfortunately, hunspell's
> "unmunch" is supposed to do this, i.e. unzip the dict-en.oxt (or
> whatever is its exact name) and use unmunch en.dic en.aff and it should
> stick the -ing, and -s etc onto all the stem words and expand the
> dictionary+aff files back into a text file containing all variants.

I generated the expanded dictionary. The stderr output is irrelevant
to me, right? What do you mean by "in theory"? What bugs are you
referring to?

$ unmunch en_US.dic en_US.aff > en_US.txt
parsing line: SET ISO8859-1
parsing line: TRY esianrtolcdugmphbyfvkwzESIANRTOLCDUGMPHBYFVKWZ'
parsing line: NOSUGGEST !
parsing line:
parsing line: # ordinal numbers
parsing line: COMPOUNDMIN 1
parsing line: # only in compounds: 1th, 2th, 3th
parsing line: ONLYINCOMPOUND c
parsing line: # compound rules:
parsing line: # 1. [0-9]*1[0-9]th (10th, 11th, 12th, 56714th, etc.)
parsing line: # 2. [0-9]*[02-9](1st|2nd|3rd|[4-9]th) (21st, 22nd, 123rd, 1234th, etc.)
parsing line: COMPOUNDRULE 2
parsing line: COMPOUNDRULE n*1t
parsing line: COMPOUNDRULE n*mp
parsing line: WORDCHARS 0123456789
parsing line:
parsing line: PFX A Y 1
parsing A entries 1
affix: re 2, strip: 0
ptable 0 num is 1 flag A
parsing line:
parsing line: PFX I Y 1
parsing I entries 1
affix: in 2, strip: 0
ptable 1 num is 1 flag I
parsing line:
parsing line: PFX U Y 1
parsing U entries 1
affix: un 2, strip: 0
ptable 2 num is 1 flag U
parsing line:
parsing line: PFX C Y 1
parsing C entries 1
affix: de 2, strip: 0
ptable 3 num is 1 flag C
parsing line:
parsing line: PFX E Y 1
parsing E entries 1
affix: dis 3, strip: 0
ptable 4 num is 1 flag E
parsing line:
parsing line: PFX F Y 1
parsing F entries 1
affix: con 3, strip: 0
ptable 5 num is 1 flag F
parsing line:
parsing line: PFX K Y 1
parsing K entries 1
affix: pro 3, strip: 0
ptable 6 num is 1 flag K
parsing line:
parsing line: SFX V N 2
parsing V entries 2
affix: ive 3, strip: e 1
affix: ive 3, strip: 0
stable 0 num is 2 flag V
parsing line:
parsing line: SFX N Y 3
parsing N entries 3
affix: ion 3, strip: e 1
affix: ication 7, strip: y 1
affix: en 2, strip: 0
stable 1 num is 3 flag N
parsing line:
parsing line: SFX X Y 3
parsing X entries 3
affix: ions 4, strip: e 1
affix: ications 8, strip: y 1
affix: ens 3, strip: 0
stable 2 num is 3 flag X
parsing line:
parsing line: SFX H N 2
parsing H entries 2
affix: ieth 4, strip: y 1
affix: th 2, strip: 0
stable 3 num is 2 flag H
parsing line:
parsing line: SFX Y Y 1
parsing Y entries 1
affix: ly 2, strip: 0
stable 4 num is 1 flag Y
parsing line:
parsing line: SFX G Y 2
parsing G entries 2
affix: ing 3, strip: e 1
affix: ing 3, strip: 0
stable 5 num is 2 flag G
parsing line:
parsing line: SFX J Y 2
parsing J entries 2
affix: ings 4, strip: e 1
affix: ings 4, strip: 0
stable 6 num is 2 flag J
parsing line:
parsing line: SFX D Y 4
parsing D entries 4
affix: d 1, strip: 0
affix: ied 3, strip: y 1
affix: ed 2, strip: 0
affix: ed 2, strip: 0
stable 7 num is 4 flag D
parsing line:
parsing line: SFX T N 4
parsing T entries 4
affix: st 2, strip: 0
affix: iest 4, strip: y 1
affix: est 3, strip: 0
affix: est 3, strip: 0
stable 8 num is 4 flag T
parsing line:
parsing line: SFX R Y 4
parsing R entries 4
affix: r 1, strip: 0
affix: ier 3, strip: y 1
affix: er 2, strip: 0
affix: er 2, strip: 0
stable 9 num is 4 flag R
parsing line:
parsing line: SFX Z Y 4
parsing Z entries 4
affix: rs 2, strip: 0
affix: iers 4, strip: y 1
affix: ers 3, strip: 0
affix: ers 3, strip: 0
stable 10 num is 4 flag Z
parsing line:
parsing line: SFX S Y 4
parsing S entries 4
affix: ies 3, strip: y 1
affix: s 1, strip: 0
affix: es 2, strip: 0
affix: s 1, strip: 0
stable 11 num is 4 flag S
parsing line:
parsing line: SFX P Y 3
parsing P entries 3
affix: iness 5, strip: y 1
affix: ness 4, strip: 0
affix: ness 4, strip: 0
stable 12 num is 3 flag P
parsing line:
parsing line: SFX M Y 1
parsing M entries 1
affix: 's 2, strip: 0
stable 13 num is 1 flag M
parsing line:
parsing line: SFX B Y 3
parsing B entries 3
affix: able 4, strip: 0
affix: able 4, strip: 0
affix: able 4, strip: e 1
stable 14 num is 3 flag B
parsing line:
parsing line: SFX L Y 1
parsing L entries 1
affix: ment 4, strip: 0
stable 15 num is 1 flag L
parsing line:
parsing line: REP 88
parsing line: REP a ei
parsing line: REP ei a
parsing line: REP a ey
parsing line: REP ey a
parsing line: REP ai ie
parsing line: REP ie ai
parsing line: REP are air
parsing line: REP are ear
parsing line: REP are eir
parsing line: REP air are
parsing line: REP air ere
parsing line: REP ere air
parsing line: REP ere ear
parsing line: REP ere eir
parsing line: REP ear are
parsing line: REP ear air
parsing line: REP ear ere
parsing line: REP eir are
parsing line: REP eir ere
parsing line: REP ch te
parsing line: REP te ch
parsing line: REP ch ti
parsing line: REP ti ch
parsing line: REP ch tu
parsing line: REP tu ch
parsing line: REP ch s
parsing line: REP s ch
parsing line: REP ch k
parsing line: REP k ch
parsing line: REP f ph
parsing line: REP ph f
parsing line: REP gh f
parsing line: REP f gh
parsing line: REP i igh
parsing line: REP igh i
parsing line: REP i uy
parsing line: REP uy i
parsing line: REP i ee
parsing line: REP ee i
parsing line: REP j di
parsing line: REP di j
parsing line: REP j gg
parsing line: REP gg j
parsing line: REP j ge
parsing line: REP ge j
parsing line: REP s ti
parsing line: REP ti s
parsing line: REP s ci
parsing line: REP ci s
parsing line: REP k cc
parsing line: REP cc k
parsing line: REP k qu
parsing line: REP qu k
parsing line: REP kw qu
parsing line: REP o eau
parsing line: REP eau o
parsing line: REP o ew
parsing line: REP ew o
parsing line: REP oo ew
parsing line: REP ew oo
parsing line: REP ew ui
parsing line: REP ui ew
parsing line: REP oo ui
parsing line: REP ui oo
parsing line: REP ew u
parsing line: REP u ew
parsing line: REP oo u
parsing line: REP u oo
parsing line: REP u oe
parsing line: REP oe u
parsing line: REP u ieu
parsing line: REP ieu u
parsing line: REP ue ew
parsing line: REP ew ue
parsing line: REP uff ough
parsing line: REP oo ieu
parsing line: REP ieu oo
parsing line: REP ier ear
parsing line: REP ear ier
parsing line: REP ear air
parsing line: REP air ear
parsing line: REP w qu
parsing line: REP qu w
parsing line: REP z ss
parsing line: REP ss z
parsing line: REP shun tion
parsing line: REP shun sion
parsing line: REP shun cion
parsed in 7 prefixes and 16 suffixes

--
Regards,
Peng
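
On the stderr question, a minimal sketch of keeping the two streams apart may help. It assumes, as the message above states, that the "parsing line: ..." trace arrives on stderr while the expanded words go to stdout. The function fake_unmunch is a hypothetical stand-in so the snippet runs without hunspell installed; the file names words.txt and trace.log are likewise only illustrative.

```shell
#!/bin/sh
# Hypothetical stand-in for unmunch (an assumption for illustration):
# expanded words on stdout, parse trace on stderr.
fake_unmunch() {
    echo "parsing line: SET ISO8859-1" >&2
    echo "parsed in 7 prefixes and 16 suffixes" >&2
    printf '%s\n' walk walks walking
}

workdir=$(mktemp -d)

# '>' captures stdout (the word list), '2>' captures stderr (the trace).
fake_unmunch > "$workdir/words.txt" 2> "$workdir/trace.log"

# Count lines in each file to confirm the streams stayed separate.
word_count=$(wc -l < "$workdir/words.txt" | tr -d ' ')
trace_count=$(wc -l < "$workdir/trace.log" | tr -d ' ')
echo "words: $word_count, trace lines: $trace_count"
```

With the real tool, the same redirection would presumably look like `unmunch en_US.dic en_US.aff > en_US.txt 2> unmunch.log`, which keeps the trace out of the terminal and leaves it available for inspection.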
