On 10:35 Sun 12 Mar, Laurent Godard wrote:
> Hi all
> 
> I plan to start work (with a linguistics student) on an affix file 
> generator for OOo
...
> 
> My questions are:
> - does something already exist?

Hi Laurent,

   John Goldsmith from Chicago has written something like
 this called "Linguistica":

 http://linguistica.uchicago.edu/

 I've tried it out and it does a reasonable job - you might start 
 by having a look at it and seeing if you can massage its output
 into an affix file.  

 You probably recall that I have web-crawled corpora and hence
 frequency lists for 200+ languages as part of the gramadoir project -
 if you get something up and running I can do some testing with these.

 Note that the important thing here (in my view) is to get
 something *linguistically* meaningful - if the goal is to merely
 compress the word list one can just run munchlist to find candidate
 affixes.   

 The real advantage of a good affix file is that once it exists one can use
 it to extract candidate word/affix pairs from a corpus automatically -
 I have code for this already (one level of affixes only for now).  So
 obviously I'll be thrilled if you get something good going.
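
 To make the word/affix extraction idea concrete, here is a minimal
 sketch in Python. It is not the code mentioned above, just an
 illustration under stated assumptions: the SuffixRule fields loosely
 mirror the strip/add/condition columns of an ispell-style suffix rule,
 and the rules, flags, and word list are made up for the example. A
 stem gets a flag only when both the stem and the derived form are
 attested in the corpus.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class SuffixRule(NamedTuple):
    """Hypothetical, simplified stand-in for one suffix rule line."""
    flag: str       # affix flag the rule belongs to
    strip: str      # characters removed from the stem ("" for none)
    add: str        # characters appended after stripping
    condition: str  # regex that the end of the stem must match


def candidate_pairs(words, rules):
    """Return {stem: set of flags} for stems whose derived form is attested.

    Only one level of suffixes is handled: the stem itself must appear
    in the word list, and so must stem - strip + add.
    """
    vocab = set(words)
    found = defaultdict(set)
    for word in vocab:
        for rule in rules:
            if not rule.add or not word.endswith(rule.add):
                continue
            # Undo the rule: drop the added suffix, restore stripped chars.
            stem = word[: len(word) - len(rule.add)] + rule.strip
            if stem in vocab and re.search(rule.condition + "$", stem):
                found[stem].add(rule.flag)
    return dict(found)


# Toy English-like rules, invented for illustration only.
RULES = [
    SuffixRule("S", "", "s", "[^sxzhy]"),
    SuffixRule("S", "y", "ies", "[^aeiou]y"),
    SuffixRule("D", "", "ed", "[^ey]"),
    SuffixRule("D", "", "d", "e"),
]

# Stand-in for a frequency list extracted from a crawled corpus.
WORDS = ["walk", "walks", "walked", "city", "cities", "bake", "baked", "bed"]

pairs = candidate_pairs(WORDS, RULES)
for stem in sorted(pairs):
    # Print in dictionary-file style, e.g. "walk/DS".
    print(f"{stem}/{''.join(sorted(pairs[stem]))}")
```

 On a real frequency list one would probably also want a minimum
 frequency cutoff, since a rare "derived form" is often just a typo or
 a foreign word that happens to fit a rule.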

-Kevin
