Meetu Patel created YARN-11242:
----------------------------------

             Summary: New Map Reduce Example - Simple Sentiment Analysis
                 Key: YARN-11242
                 URL: https://issues.apache.org/jira/browse/YARN-11242
             Project: Hadoop YARN
          Issue Type: Improvement
    Affects Versions: 3.4.0
            Reporter: Meetu Patel
             Fix For: 3.4.0
         Attachments: sample_data.txt, sample_words.txt

I would like to add a new MapReduce example: sentiment analysis. 
The sentiment analysis MapReduce job determines a sentiment score for each 
user. It assigns a sentiment score to each tweet (sentence) made by a user and 
then aggregates the scores of all tweets made by that user.

This example takes a Twitter dataset containing users and their tweets, and 
produces output of the form <username, sentiment score>. For each user, the 
sentiment score is calculated over all the tweets made by that user.
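
The per-user aggregation described above can be sketched outside Hadoop in a few lines of Python. This is only an illustration of the map and reduce steps, not the proposed implementation: the `username<TAB>tweet` line layout, the function names, and the toy scorer are all assumptions made for the example.

```python
from collections import defaultdict


def map_tweet(line, score_fn):
    """Map step: turn one input line into a (username, score) pair.

    Assumes a hypothetical 'username<TAB>tweet' line layout; the real
    dataset format is not specified in this issue.
    """
    username, _, tweet = line.partition("\t")
    return username.strip(), score_fn(tweet)


def reduce_scores(pairs):
    """Reduce step: sum the tweet scores emitted for each username."""
    totals = defaultdict(int)
    for username, score in pairs:
        totals[username] += score
    return dict(totals)


def toy_score(tweet):
    """Placeholder scorer: +1 per 'good', -1 per 'bad'."""
    words = tweet.lower().split()
    return words.count("good") - words.count("bad")


lines = [
    "alice\tgood good day",
    "bob\tbad weather",
    "alice\tbad traffic",
]
result = reduce_scores(map_tweet(line, toy_score) for line in lines)
print(result)
```

In a real job the shuffle phase would group the pairs by username before the reducer runs; the dictionary here stands in for that grouping.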

This MapReduce example takes two input files: the input Twitter dataset and a 
file containing a list of words.
The word list file contains positive, negative, and negation words, which are 
used to assign a sentiment score to the words in tweets.
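
One plausible way to combine the three word categories is sketched below in Python. The issue does not spell out the scoring rule or the word list file format, so everything here is an assumption: the hard-coded word sets, the +1/-1 weights, and the rule that a negation word flips the sign of the next sentiment-bearing word.

```python
# Hypothetical word sets; the real ones would be loaded from the word list file.
POSITIVE = {"good", "happy", "love"}
NEGATIVE = {"bad", "sad", "hate"}
NEGATION = {"not", "never", "no"}


def score_tweet(tweet, positive=POSITIVE, negative=NEGATIVE, negation=NEGATION):
    """Return an integer sentiment score for a single tweet.

    Each positive word counts +1 and each negative word counts -1.
    A negation word flips the sign of the next sentiment-bearing word
    (an assumed rule, chosen only for illustration).
    """
    score = 0
    negate = False
    for raw in tweet.lower().split():
        word = raw.strip(".,!?;:\"'")  # drop surrounding punctuation
        if word in negation:
            negate = True
            continue
        if word in positive:
            value = 1
        elif word in negative:
            value = -1
        else:
            continue  # neutral word: leaves any pending negation in place
        score += -value if negate else value
        negate = False
    return score


print(score_tweet("not a good day"))
print(score_tweet("I love this, never sad!"))
```

Letting a pending negation carry over neutral words means "not a good day" is treated like "not good"; a stricter rule that only negates the immediately following word would be equally easy to write.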

You can run the example with the following command:
bin/hadoop jar /HADOOP_PATH/share/hadoop/mapreduce/mapreduce-examples.jar \
    sentimentanalysis <input file/dir path> <output dir path> \
    <word list file/dir path>

For example, using the attached sample files, run:
bin/hadoop jar /HADOOP_PATH/share/hadoop/mapreduce/mapreduce-examples.jar \
    sentimentanalysis sample_data.txt <output dir path> sample_words.txt



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
