I think we should start a new module, called NLP, that holds the NLP-related material and can serve as the seed for a subproject.
Either that, or put them in the utils module, which is where I envision everything that is "helpful" for ML, but not required, going.

On Jan 16, 2010, at 8:41 AM, Benson Margulies wrote:

> I have approval from the CEO to contribute our collection of
> abbreviations to Mahout.
>
> We use them with the ICU breakers.
>
> I guess IP clearance is called for here, but, thinking ahead, where
> would people like to see files of abbreviations in various languages
> show up?