Re: [lingu-dev] unmunch separator

Kevin B. Hendricks Tue, 10 Apr 2007 11:44:45 -0700

Hi,

Please remember that unmunch does not guarantee a one-to-one mapping between words and root forms. For example, the same unmunched word may be generated more than once, from different root words and affixes.

That is why the unmunched list of words is typically sorted uniquely to remove duplicates.

The basic idea is that a raw word list, when compressed by affix compression (munch), will always expand (unmunch) to exactly the same raw word list after sorting uniquely, with no additions or deletions.
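
To illustrate the "sort uniquely" step, here is a minimal sketch in C. It is not part of unmunch; the names cmp_words and sort_unique are made up for this example, and it assumes the expanded words are already held in memory as an array of C strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings (char *). */
static int cmp_words(const void *a, const void *b)
{
    const char *const *wa = a;
    const char *const *wb = b;
    return strcmp(*wa, *wb);
}

/*
 * Hypothetical helper, not taken from the hunspell sources.
 * Sorts words[0..n) bytewise, then drops adjacent duplicates in place.
 * Returns the number of unique entries now at the front of the array.
 * Pointers to duplicate strings are overwritten and never freed, so the
 * caller must keep its own reference if the strings were heap-allocated.
 */
size_t sort_unique(char **words, size_t n)
{
    size_t out = 0;

    assert(words != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(words, n, sizeof *words, cmp_words);

    for (size_t i = 1; i < n; i++) {
        /* Keep words[i] only if it differs from the last kept word. */
        if (strcmp(words[out], words[i]) != 0)
            words[++out] = words[i];
    }
    return out + 1;
}
```

In practice the same effect is usually obtained by piping the unmunched output through a tool such as sort -u, whose ordering may differ from the bytewise order used here depending on the locale.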


FWIW,

Kevin


On Apr 10, 2007, at 2:31 PM, Oleg Burlaca wrote:

Jancs wrote:

I suppose you have to edit the unmunch source to get such an option.

Janis

Yes Jancs, you were right. I've modified the /src/tools/unmunch.c file from the hunspell package.

Just added a line:
    fprintf(stdout, "%s\n", "---");
after the block that writes out wordforms:
    for (i = 0; i < numwords; i++) {
        fprintf(stdout, "%s\n", wlist[i].word);
        free(wlist[i].word);
        wlist[i].word = NULL;
        wlist[i].pallow = 0;
    }


It was easier than I thought :))
Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]

For additional commands, e-mail: dev-[EMAIL PROTECTED]

