Github user helenahm commented on the issue:
https://github.com/apache/incubator-hivemall/pull/93
In Hivemall-126:
Max Entropy Classifier (a.k.a. Multi-nominal/Multiclass Logistic
Regression) [1,2] is useful for Text classification.
The Max Entropy Classifier is more often used for Part-of-Speech (POS)
Tagging and Named Entity Recognition, and for other tasks where context is
used as features. Those are also fundamental NLP tasks, although Text
Classification is a candidate too.
Mohri and his colleagues also put the POS task first. As Mohri writes in the
article that is the basis for the implementation I have chosen:
Our first set of experiments were carried out with "medium" scale data
sets containing 1M-300M instances. These included: English part-of-speech
tagging, generated from the Penn Treebank [16] using the first character
of each part-of-speech tag as output, sections 2-21 for training, section
23 for testing and a feature representation based on the identity,
affixes, and orthography of the input word and the words in a window of
size two; Sentiment analysis, generated from a set of online product,
service, and merchant reviews with a three-label output (positive,
negative, neutral), with a bag of words feature representation; RCV1-v2 as
described by [14], where documents having multiple labels were included
multiple times, once for each label; Acoustic Speech Data, a
39-dimensional input consisting of 13 PLP coefficients, plus their first
and second derivatives, and 129 outputs (43 phones × 3 acoustic states);
and the Deja News Archive, a text topic classification problem generated
from a collection of Usenet discussion forums from the years 1995-2000.
For all text experiments, we used random feature mixing [9, 20] to control
the size of the feature space.
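
To illustrate the POS feature representation and the random feature mixing
mentioned in the quote, here is a minimal Python sketch. It is not taken
from the article or from this pull request. The function names
context_features and hash_features, the three-character affix length, the
CRC32 hash, and the bucket count are all assumptions made for the example,
and random feature mixing is approximated here by plain feature hashing.

```python
import zlib


def context_features(tokens, i, window=2):
    """Hypothetical feature template for the token at position i.

    Emits identity, affix and orthography features for the input word and
    for each word within `window` positions. Tokens are assumed non-empty.
    """
    feats = []
    for offset in range(-window, window + 1):
        j = i + offset
        if j < 0 or j >= len(tokens):
            # Positions outside the sentence get a padding feature.
            feats.append(f"w[{offset}]=<PAD>")
            continue
        w = tokens[j]
        feats.append(f"w[{offset}]={w.lower()}")          # identity
        feats.append(f"pre3[{offset}]={w[:3].lower()}")   # prefix (affix)
        feats.append(f"suf3[{offset}]={w[-3:].lower()}")  # suffix (affix)
        feats.append(f"cap[{offset}]={w[0].isupper()}")   # orthography
        feats.append(f"digit[{offset}]={any(c.isdigit() for c in w)}")
    return feats


def hash_features(feats, num_buckets=2 ** 18):
    """Map string features into a fixed number of buckets.

    Colliding features share a bucket and their counts are summed, which
    bounds the size of the feature space (assumed stand-in for random
    feature mixing).
    """
    vec = {}
    for f in feats:
        idx = zlib.crc32(f.encode("utf-8")) % num_buckets
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec


if __name__ == "__main__":
    sentence = ["The", "cat", "sat"]
    x = hash_features(context_features(sentence, 1), num_buckets=1024)
    print(len(x), "non-zero buckets")
```

The resulting sparse vector has at most num_buckets dimensions regardless
of vocabulary size, which is the property the quote relies on to keep the
feature space under control.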