On 10:35 Sun 12 Mar, Laurent Godard wrote:
> Hi all
>
> I plan to start work (with a linguistics student) on an affix file
> generator for OOo ...
>
> My questions are:
> - does something already exist?

Hi Laurent,

John Goldsmith from Chicago has written something like this called
"Linguistica":

http://linguistica.uchicago.edu/

I've tried it out and it does a reasonable job - you might start by
having a look at it and seeing if you can massage its output into an
affix file.

You probably recall that I have web-crawled corpora and hence frequency
lists for 200+ languages as part of the gramadoir project - if you get
something up and running I can do some testing with these.

Note that the important thing here (in my view) is to get something
*linguistically* meaningful - if the goal is merely to compress the
word list one can just run munchlist to find candidate affixes. The
real advantage of a good affix file is that once it exists one can use
it to extract candidate word/affix pairs from a corpus automatically -
I have code for this already (one level of affixes only for now).

So obviously I'll be thrilled if you get something good going.

-Kevin
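
To make the corpus-extraction idea concrete, here is a rough sketch of how candidate word/affix pairs might be pulled from a word list with one level of suffixes. It is an illustration only, not the code mentioned above: the RULES table, the flag names and both function names are hypothetical, and the rules are hard-coded (strip, add) pairs rather than being parsed from a real affix file.

```python
from collections import defaultdict

# Hypothetical, simplified suffix rules: flag -> list of (strip, add) pairs.
# They imitate the strip/add fields of a suffix entry in an affix file,
# but are written by hand here rather than read from one.
RULES = {
    "S": [("", "s")],
    "D": [("e", "ed"), ("", "ed")],
}


def apply_rule(stem, strip, add):
    """Return the affixed form, or None if the stem does not end in `strip`."""
    if not stem.endswith(strip):
        return None
    return stem[: len(stem) - len(strip)] + add


def candidate_pairs(corpus_words, rules):
    """Map each attested stem to the flags whose derived form is also attested."""
    words = set(corpus_words)
    found = defaultdict(set)
    for stem in words:
        for flag, entries in rules.items():
            for strip, add in entries:
                form = apply_rule(stem, strip, add)
                if form is not None and form in words:
                    found[stem].add(flag)
    return dict(found)


if __name__ == "__main__":
    corpus = ["walk", "walks", "walked", "bake", "bakes", "baked", "tree"]
    for stem, flags in sorted(candidate_pairs(corpus, RULES).items()):
        print(stem + "/" + "".join(sorted(flags)))
```

A real version would presumably also want the frequency counts, so that a stem whose derived forms turn up only once or twice can be treated with more suspicion than a well-attested one.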
