Github user helenahm commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/93
  
    In Hivemall-126: 
    
    Max Entropy Classifier (a.k.a. Multinomial/Multiclass Logistic 
Regression) [1,2] is useful for text classification.
    
    However, the Max Entropy Classifier is more often used for Part-of-Speech 
(POS) tagging and Named Entity Recognition, and for other tasks where context 
is used as features. Those are also fundamental NLP tasks. Text classification 
is a candidate too, though.
    
    Mohri and his colleagues also put the POS task first. As Mohri writes in 
the article that is the basis for the implementation I have chosen:
    
    Our first set of experiments were carried out with “medium” scale data 
sets containing 1M-300M instances.
    These included: English part-of-speech tagging, generated from the Penn 
Treebank
    [16] using the first character of each part-of-speech tag as output, 
sections 2-21 for training, section
    23 for testing and a feature representation based on the identity, affixes, 
and orthography of the input
    word and the words in a window of size two; Sentiment analysis, generated 
from a set of
    online product, service, and merchant reviews with a three-label output 
(positive, negative, neutral),
    with a bag of words feature representation; RCV1-v2 as described by [14], 
where documents having
    multiple labels were included multiple times, once for each label; Acoustic 
Speech Data, a 39-dimensional
    input consisting of 13 PLP coefficients, plus their first and 
second derivatives, and 129
    outputs (43 phones × 3 acoustic states); and the Deja News Archive, a text 
topic classification
    problem generated from a collection of Usenet discussion forums from the 
years 1995-2000. For all
    text experiments, we used random feature mixing [9, 20] to control the size 
of the feature space.
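
To make the POS feature representation in the quote more concrete, here is a minimal Python sketch of features built from the identity, affixes, and orthography of a word plus the words in a window of size two, followed by hashing into a fixed number of buckets. I am assuming that "random feature mixing" refers to what is commonly called the hashing trick. The function names, the 3-character affix length, the particular orthographic tests, the CRC32 hash, and the bucket count are all illustrative assumptions of mine; they come neither from the paper nor from Hivemall.

```python
import zlib


def word_features(tokens, i, window=2):
    """Build string features for tokens[i] (hypothetical feature templates)."""
    w = tokens[i]
    feats = [
        "w=" + w.lower(),                                 # identity
        "prefix3=" + w[:3].lower(),                       # affixes (length 3 is an assumption)
        "suffix3=" + w[-3:].lower(),
        "is_capitalized=" + str(w[:1].isupper()),         # orthography
        "has_digit=" + str(any(c.isdigit() for c in w)),
        "has_hyphen=" + str("-" in w),
    ]
    # Words in a window of size two on each side, padded at sentence edges.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        neighbor = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
        feats.append("w[%+d]=%s" % (off, neighbor))
    return feats


def hash_features(feats, num_buckets=2 ** 18):
    """Map string features to a sparse vector {bucket index: count}.

    Distinct features may collide in one bucket; that is the price paid
    for a feature space whose size is fixed in advance.
    """
    vec = {}
    for f in feats:
        idx = zlib.crc32(f.encode("utf-8")) % num_buckets
        vec[idx] = vec.get(idx, 0) + 1
    return vec


tokens = ["The", "quick", "brown", "fox", "jumps"]
feats = word_features(tokens, 2)              # features for "brown"
vec = hash_features(feats, num_buckets=1024)  # small bucket count for the demo
```

The resulting sparse vector could then be fed to any multiclass classifier, including a Max Entropy model. The number of buckets trades memory against collisions.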


