[ https://issues.apache.org/jira/browse/TIKA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995572#comment-15995572 ]
ASF GitHub Bot commented on TIKA-2016:
--------------------------------------

chrismattmann commented on issue #169: TIKA-2016 Sentiment Analysis Parser Contributed by amensiko and thammegowda
URL: https://github.com/apache/tika/pull/169#issuecomment-299022223

## Build passes:

```
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Tika 1.15-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ tika ---
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ tika ---
[INFO]
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ tika ---
[INFO]
[INFO] --- forbiddenapis:2.2:check (default) @ tika ---
[INFO] Skipping execution for packaging "pom"
[INFO]
[INFO] --- forbiddenapis:2.2:testCheck (default) @ tika ---
[INFO] Skipping execution for packaging "pom"
[INFO]
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ tika ---
[INFO] Installing /Users/mattmann/git/tika-gh/pom.xml to /Users/mattmann/.m2/repository/org/apache/tika/tika/1.15-SNAPSHOT/tika-1.15-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Tika parent ................................. SUCCESS [ 1.047 s]
[INFO] Apache Tika core ................................... SUCCESS [ 22.697 s]
[INFO] Apache Tika parsers ................................ SUCCESS [03:23 min]
[INFO] Apache Tika XMP .................................... SUCCESS [ 1.649 s]
[INFO] Apache Tika serialization .......................... SUCCESS [ 1.436 s]
[INFO] Apache Tika batch .................................. SUCCESS [01:50 min]
[INFO] Apache Tika language detection ..................... SUCCESS [ 3.730 s]
[INFO] Apache Tika application ............................ SUCCESS [ 32.442 s]
[INFO] Apache Tika OSGi bundle ............................ SUCCESS [ 17.231 s]
[INFO] Apache Tika translate .............................. SUCCESS [ 1.825 s]
[INFO] Apache Tika server ................................. SUCCESS [ 35.426 s]
[INFO] Apache Tika examples ............................... SUCCESS [ 9.609 s]
[INFO] Apache Tika Java-7 Components ...................... SUCCESS [ 1.738 s]
[INFO] Apache Tika eval ................................... SUCCESS [ 24.480 s]
[INFO] Apache Tika ........................................ SUCCESS [ 0.022 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 07:48 min
[INFO] Finished at: 2017-05-03T13:03:29-07:00
[INFO] Final Memory: 164M/1530M
[INFO] ------------------------------------------------------------------------
LMC-053601:tika-gh mattmann$
```

I also tried it myself on the following files:

`sample.sent`

```
Man I'm so tired of battling against OSGI!
```

`sample2.sent`

```
Whatever, I need some cooling off time!
```

# Binary sentiment

```
LMC-053601:tika-gh mattmann$ java -jar tika-app/target/tika-app-1.15-SNAPSHOT.jar \
> --config=tika-parsers/src/test/resources/org/apache/tika/parser/sentiment/analysis/tika-config-sentiment-opennlp.xml \
> -m sample.sent
WARN JBIG2ImageReader not loaded. jbig2 files will be ignored
INFO Sentiment Model is at https://raw.githubusercontent.com/USCDataScience/SentimentAnalysisParser/master/sentiment-models/en-netflix-sentiment.bin
Content-Length: 43
Content-Type: application/sentiment
Sentiment: negative
X-Parsed-By: org.apache.tika.parser.CompositeParser
X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
resourceName: sample.sent
LMC-053601:tika-gh mattmann$
```

# Categorical (multi-class sentiment)

Changing to use `sample2.sent`

```
LMC-053601:tika-gh mattmann$ java -jar tika-app/target/tika-app-1.15-SNAPSHOT.jar --config=tika-parsers/src/test/resources/org/apache/tika/parser/sentiment/analysis/tika-config-sentiment-opennlp-cat.xml -m sample2.sent
WARN JBIG2ImageReader not loaded. jbig2 files will be ignored
INFO Sentiment Model is at https://raw.githubusercontent.com/USCDataScience/SentimentAnalysisParser/master/sentiment-models/ht-sentiment-categ.bin
Content-Length: 39
Content-Type: application/sentiment
Sentiment: angry
X-Parsed-By: org.apache.tika.parser.CompositeParser
X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
resourceName: sample2.sent
LMC-053601:tika-gh mattmann$
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> A parser that combines Apache OpenNLP and Apache Tika and provides facilities
> for automatically deriving sentiment from text.
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: TIKA-2016
> URL: https://issues.apache.org/jira/browse/TIKA-2016
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Anastasija Mensikova
> Assignee: Chris A. Mattmann
> Labels: analysis, gsoc2016, memex, parser, sentiment
> Fix For: 1.15
>
>
> A new project that implements a parser that uses Apache OpenNLP and Apache
> Tika to perform Sentiment Analysis.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)