On Wed, Jul 14, 2010 at 10:06 AM, Caolán McNamara <[email protected]> wrote:
> On Wed, 2010-07-14 at 10:00 -0500, Peng Yu wrote:
>> I realize that it may be better to send the questions to the dev
>> mailing list, so I am forwarding them here. Thank you!
>>
>>
>> ---------- Forwarded message ----------
>> From: Peng Yu <[email protected]>
>> Date: Wed, Jul 14, 2010 at 9:54 AM
>> Subject: Is there a dictionary in the format of text file?
>> To: [email protected]
>>
>>
>> Hi,
>>
>> I'd like to have a local copy of all the English words (including the
>> variants, like, plural form, -ing form, -s form). Since there is spell
>> check in OO, there might be a dictionary file available somewhere.
>> Would you please let me know where it is?
>
> In theory (though unfortunately there are bugs in it), hunspell's
> "unmunch" is supposed to do this, i.e. unzip the dict-en.oxt (or
> whatever its exact name is) and run "unmunch en.dic en.aff"; it should
> attach -ing, -s, etc. to all the stem words and expand the
> dictionary+aff files back into a text file containing all variants.

I generated the expanded dictionary. The stderr output is irrelevant
to me, right? What do you mean by "in theory"? What bugs are you
referring to?

$ unmunch en_US.dic en_US.aff  > en_US.txt
parsing line: SET ISO8859-1
parsing line: TRY esianrtolcdugmphbyfvkwzESIANRTOLCDUGMPHBYFVKWZ'
parsing line: NOSUGGEST !
parsing line:
parsing line: # ordinal numbers
parsing line: COMPOUNDMIN 1
parsing line: # only in compounds: 1th, 2th, 3th
parsing line: ONLYINCOMPOUND c
parsing line: # compound rules:
parsing line: # 1. [0-9]*1[0-9]th (10th, 11th, 12th, 56714th, etc.)
parsing line: # 2. [0-9]*[02-9](1st|2nd|3rd|[4-9]th) (21st, 22nd, 123rd, 1234th, etc.)
parsing line: COMPOUNDRULE 2
parsing line: COMPOUNDRULE n*1t
parsing line: COMPOUNDRULE n*mp
parsing line: WORDCHARS 0123456789
parsing line:
parsing line: PFX A Y 1
parsing A entries 1
   affix: re 2, strip:  0
ptable 0 num is 1 flag A
parsing line:
parsing line: PFX I Y 1
parsing I entries 1
   affix: in 2, strip:  0
ptable 1 num is 1 flag I
parsing line:
parsing line: PFX U Y 1
parsing U entries 1
   affix: un 2, strip:  0
ptable 2 num is 1 flag U
parsing line:
parsing line: PFX C Y 1
parsing C entries 1
   affix: de 2, strip:  0
ptable 3 num is 1 flag C
parsing line:
parsing line: PFX E Y 1
parsing E entries 1
   affix: dis 3, strip:  0
ptable 4 num is 1 flag E
parsing line:
parsing line: PFX F Y 1
parsing F entries 1
   affix: con 3, strip:  0
ptable 5 num is 1 flag F
parsing line:
parsing line: PFX K Y 1
parsing K entries 1
   affix: pro 3, strip:  0
ptable 6 num is 1 flag K
parsing line:
parsing line: SFX V N 2
parsing V entries 2
   affix: ive 3, strip: e 1
   affix: ive 3, strip:  0
stable 0 num is 2 flag V
parsing line:
parsing line: SFX N Y 3
parsing N entries 3
   affix: ion 3, strip: e 1
   affix: ication 7, strip: y 1
   affix: en 2, strip:  0
stable 1 num is 3 flag N
parsing line:
parsing line: SFX X Y 3
parsing X entries 3
   affix: ions 4, strip: e 1
   affix: ications 8, strip: y 1
   affix: ens 3, strip:  0
stable 2 num is 3 flag X
parsing line:
parsing line: SFX H N 2
parsing H entries 2
   affix: ieth 4, strip: y 1
   affix: th 2, strip:  0
stable 3 num is 2 flag H
parsing line:
parsing line: SFX Y Y 1
parsing Y entries 1
   affix: ly 2, strip:  0
stable 4 num is 1 flag Y
parsing line:
parsing line: SFX G Y 2
parsing G entries 2
   affix: ing 3, strip: e 1
   affix: ing 3, strip:  0
stable 5 num is 2 flag G
parsing line:
parsing line: SFX J Y 2
parsing J entries 2
   affix: ings 4, strip: e 1
   affix: ings 4, strip:  0
stable 6 num is 2 flag J
parsing line:
parsing line: SFX D Y 4
parsing D entries 4
   affix: d 1, strip:  0
   affix: ied 3, strip: y 1
   affix: ed 2, strip:  0
   affix: ed 2, strip:  0
stable 7 num is 4 flag D
parsing line:
parsing line: SFX T N 4
parsing T entries 4
   affix: st 2, strip:  0
   affix: iest 4, strip: y 1
   affix: est 3, strip:  0
   affix: est 3, strip:  0
stable 8 num is 4 flag T
parsing line:
parsing line: SFX R Y 4
parsing R entries 4
   affix: r 1, strip:  0
   affix: ier 3, strip: y 1
   affix: er 2, strip:  0
   affix: er 2, strip:  0
stable 9 num is 4 flag R
parsing line:
parsing line: SFX Z Y 4
parsing Z entries 4
   affix: rs 2, strip:  0
   affix: iers 4, strip: y 1
   affix: ers 3, strip:  0
   affix: ers 3, strip:  0
stable 10 num is 4 flag Z
parsing line:
parsing line: SFX S Y 4
parsing S entries 4
   affix: ies 3, strip: y 1
   affix: s 1, strip:  0
   affix: es 2, strip:  0
   affix: s 1, strip:  0
stable 11 num is 4 flag S
parsing line:
parsing line: SFX P Y 3
parsing P entries 3
   affix: iness 5, strip: y 1
   affix: ness 4, strip:  0
   affix: ness 4, strip:  0
stable 12 num is 3 flag P
parsing line:
parsing line: SFX M Y 1
parsing M entries 1
   affix: 's 2, strip:  0
stable 13 num is 1 flag M
parsing line:
parsing line: SFX B Y 3
parsing B entries 3
   affix: able 4, strip:  0
   affix: able 4, strip:  0
   affix: able 4, strip: e 1
stable 14 num is 3 flag B
parsing line:
parsing line: SFX L Y 1
parsing L entries 1
   affix: ment 4, strip:  0
stable 15 num is 1 flag L
parsing line:
parsing line: REP 88
parsing line: REP a ei
parsing line: REP ei a
parsing line: REP a ey
parsing line: REP ey a
parsing line: REP ai ie
parsing line: REP ie ai
parsing line: REP are air
parsing line: REP are ear
parsing line: REP are eir
parsing line: REP air are
parsing line: REP air ere
parsing line: REP ere air
parsing line: REP ere ear
parsing line: REP ere eir
parsing line: REP ear are
parsing line: REP ear air
parsing line: REP ear ere
parsing line: REP eir are
parsing line: REP eir ere
parsing line: REP ch te
parsing line: REP te ch
parsing line: REP ch ti
parsing line: REP ti ch
parsing line: REP ch tu
parsing line: REP tu ch
parsing line: REP ch s
parsing line: REP s ch
parsing line: REP ch k
parsing line: REP k ch
parsing line: REP f ph
parsing line: REP ph f
parsing line: REP gh f
parsing line: REP f gh
parsing line: REP i igh
parsing line: REP igh i
parsing line: REP i uy
parsing line: REP uy i
parsing line: REP i ee
parsing line: REP ee i
parsing line: REP j di
parsing line: REP di j
parsing line: REP j gg
parsing line: REP gg j
parsing line: REP j ge
parsing line: REP ge j
parsing line: REP s ti
parsing line: REP ti s
parsing line: REP s ci
parsing line: REP ci s
parsing line: REP k cc
parsing line: REP cc k
parsing line: REP k qu
parsing line: REP qu k
parsing line: REP kw qu
parsing line: REP o eau
parsing line: REP eau o
parsing line: REP o ew
parsing line: REP ew o
parsing line: REP oo ew
parsing line: REP ew oo
parsing line: REP ew ui
parsing line: REP ui ew
parsing line: REP oo ui
parsing line: REP ui oo
parsing line: REP ew u
parsing line: REP u ew
parsing line: REP oo u
parsing line: REP u oo
parsing line: REP u oe
parsing line: REP oe u
parsing line: REP u ieu
parsing line: REP ieu u
parsing line: REP ue ew
parsing line: REP ew ue
parsing line: REP uff ough
parsing line: REP oo ieu
parsing line: REP ieu oo
parsing line: REP ier ear
parsing line: REP ear ier
parsing line: REP ear air
parsing line: REP air ear
parsing line: REP w qu
parsing line: REP qu w
parsing line: REP z ss
parsing line: REP ss z
parsing line: REP shun tion
parsing line: REP shun sion
parsing line: REP shun cion
parsed in 7 prefixes and 16 suffixes
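
As a rough illustration of what the "affix: X n, strip: Y m" lines above
seem to describe: each entry pairs a string to append (X, of length n)
with a string to remove from the end of the stem first (Y, of length m).
The sketch below is Python written for this note, not hunspell's code.
The function name apply_suffix and the example stems are hypothetical,
and the conditions that real .aff rules attach to each entry (they are
not shown in this debug output) are ignored.

```python
def apply_suffix(stem, strip, affix):
    """Remove `strip` from the end of `stem`, then append `affix`.

    Hypothetical helper mirroring one "affix: ..., strip: ..." log entry.
    Real .aff rules also carry a condition on the stem; that is omitted here.
    """
    if not stem.endswith(strip):
        raise ValueError(f"{stem!r} does not end with {strip!r}")
    # An empty strip string removes nothing, so the slice keeps the whole stem.
    base = stem[:len(stem) - len(strip)]
    return base + affix


# Illustrative stems (assumed, not taken from en_US.dic):
print(apply_suffix("carry", "y", "ies"))   # flag S: "affix: ies 3, strip: y 1"
print(apply_suffix("create", "e", "ion"))  # flag N: "affix: ion 3, strip: e 1"
print(apply_suffix("walk", "", "ing"))     # flag G: "affix: ing 3, strip:  0"
```

Under these assumptions the three calls print carries, creation and
walking, which is the kind of variant the expanded en_US.txt is
expected to contain.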



-- 
Regards,
Peng
