[EMAIL PROTECTED] wrote:
> 
> Kir,
> 
> Could you please explain the format of the langmap file?

The langmap file contains patterns generated by
statistical analysis of a large body of text. The charset
guesser in aspseek is based on...
well, sorry, I can not remember or find that program;
I just remember it was a simple Perl library with
an online demo.

Wow! I have finally found it:
http://odur.let.rug.nl/~vannoord/TextCat/
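
For those who do not know TextCat: it ranks the most frequent
character n-grams of a text and compares that ranking against
per-language rankings. Below is a minimal Python sketch of the idea.
The function names, the cutoff of 300 n-grams and the sample texts
are my own assumptions for illustration; this is not the aspseek or
lmgen code, and it does not reproduce the actual langmap file format.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top=300):
    """Return the `top` most frequent character n-grams (1..max_n), ranked."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries with underscores
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Most frequent first; ties broken alphabetically to stay deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def guess(text, profiles):
    """Return the key of the profile closest to the text's own profile."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda name: out_of_place(doc, profiles[name]))


# Hypothetical training samples; a real profile needs far more text.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze",
}
profiles = {name: ngram_profile(text) for name, text in SAMPLES.items()}

if __name__ == "__main__":
    print(guess("the dog and the fox", profiles))
```

With samples this small the guess is unreliable; the method only
becomes useful when each profile is built from a large text.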

You can generate your own langmap files using the lmgen
utility from http://www.aspseek.org/contrib.html

> Also, how does it work with charsets.conf?

Use of non-Unicode in the latest 1.2.x versions is deprecated;
we no longer support it. Also, non-Unicode will probably
be removed from 1.3.x and future versions. You should use
ucharset.conf; for more details see
http://www.aspseek.org/man/aspseek.conf.5.html#lbAO

-- [EMAIL PROTECTED] ICQ UIN 7551596 Phone +7 903 6722750 --
   Guinness a Day Keeps a Doctor Away (people's wisdom)
