The Ngram Statistics Package (NSP) is mostly intended to help you find the most frequent ngrams in a corpus, or the most strongly associated ngrams in a corpus. It doesn't directly give you informativeness, although you can certainly come up with ways to use frequency and measures of association to get at that. It sounds like you should look at our paper on NSP to get some ideas about how to use it and what it offers.
http://www.d.umn.edu/~tpederse/Pubs/cicling2003-2.pdf

Also, the code itself has some documentation that should be helpful...

http://search.cpan.org/~tpederse/Text-NSP/doc/README.pod
http://search.cpan.org/~tpederse/Text-NSP/doc/USAGE.pod

I hope this helps!
Ted

On Tue, May 10, 2016 at 5:22 AM, 'Amir H. Jadidinejad' amir.jad...@yahoo.com [ngram] <ngram@yahoogroups.com> wrote:

> Hi,
>
> I have a corpus of 3K short text documents. I’m going to *recognize the most informative n-grams* in the corpus.
> Unfortunately, I can’t find a straight way from the documents. Would you please help me?
>
> Kind regards,
> Amir H. Jadidinejad
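
P.S. To make the idea of combining frequency with a measure of association a bit more concrete, here is a rough sketch in Python. It is not NSP and does not use NSP's code; it is only an illustration. The function name `rank_bigrams`, the `min_count` cutoff, whitespace tokenization, and the choice of pointwise mutual information (PMI) as the association measure are all assumptions made for the example. Whether a high PMI score really corresponds to "informative" for your corpus is something you would have to judge yourself.

```python
import math
from collections import Counter


def rank_bigrams(docs, min_count=2):
    """Rank bigrams by PMI, skipping those seen fewer than min_count times.

    Hypothetical helper for illustration only: tokenization is a plain
    lowercase whitespace split, and bigrams never cross document boundaries.
    """
    unigrams = Counter()
    bigrams = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scored = []
    for (w1, w2), count in bigrams.items():
        # Frequency filter: PMI is unreliable for very rare bigrams.
        if count < min_count:
            continue
        p_joint = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        pmi = math.log2(p_joint / (p_w1 * p_w2))
        scored.append(((w1, w2), count, pmi))

    # Highest PMI first; break ties by higher frequency.
    scored.sort(key=lambda item: (-item[2], -item[1]))
    return scored


if __name__ == "__main__":
    sample = ["new york is big", "new york is far", "it is big"]
    for bigram, count, pmi in rank_bigrams(sample):
        print(bigram, count, round(pmi, 3))
```

The frequency cutoff matters here: without it, bigrams that occur only once tend to get the highest association scores simply because their component words are rare.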