[jira] [Commented] (TIKA-1909) Tika 2.0 - Allow Proxy Parser and Detectors to accept Classloaders

2016-03-25 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212812#comment-15212812 ] Hudson commented on TIKA-1909: -- SUCCESS: Integrated in tika-2.x #60 (See

[jira] [Created] (TIKA-1910) Tika 2.0 - Decouple Tika Parser Office Module from Other Dependencies

2016-03-25 Thread Bob Paulin (JIRA)
Bob Paulin created TIKA-1910: Summary: Tika 2.0 - Decouple Tika Parser Office Module from Other Dependencies Key: TIKA-1910 URL: https://issues.apache.org/jira/browse/TIKA-1910 Project: Tika

[jira] [Resolved] (TIKA-1909) Tika 2.0 - Allow Proxy Parser and Detectors to accept Classloaders

2016-03-25 Thread Bob Paulin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bob Paulin resolved TIKA-1909. -- Resolution: Fixed > Tika 2.0 - Allow Proxy Parser and Detectors to accept Classloaders >

[jira] [Created] (TIKA-1909) Tika 2.0 - Allow Proxy Parser and Detectors to accept Classloaders

2016-03-25 Thread Bob Paulin (JIRA)
Bob Paulin created TIKA-1909: Summary: Tika 2.0 - Allow Proxy Parser and Detectors to accept Classloaders Key: TIKA-1909 URL: https://issues.apache.org/jira/browse/TIKA-1909 Project: Tika Issue

[jira] [Commented] (TIKA-1908) --list-met-models does not display Dublin core along with other metadata models

2016-03-25 Thread Sharmilee S (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212633#comment-15212633 ] Sharmilee S commented on TIKA-1908: --- Can you please explain what a new-style metadata key is? >

[jira] [Commented] (TIKA-1908) --list-met-models does not display Dublin core along with other metadata models

2016-03-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212612#comment-15212612 ] Nick Burch commented on TIKA-1908: -- I seem to recall there was a deliberate policy to avoid putting all

Re: GSOC2016 Sentiment Analysis

2016-03-25 Thread Harshavardhan Manjunatha
Dear Prof. Mattmann, Thanks. But the Fisher Callhome Corpus is a training corpus for Machine Translation between Spanish & English. I don't think it can be adapted to Sentiment Analysis. Developing a generic training model/corpus for Sentiment Analysis that encapsulates social media, movie reviews,

Re: GSOC2016 Sentiment Analysis

2016-03-25 Thread Mattmann, Chris A (3980)
Sounds great Harsha. This is for Google Summer of Code, so collaborating would be great, and in this case, we would be working with Madhawa, should he choose to accept. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and

Re: GSOC2016 Sentiment Analysis

2016-03-25 Thread Harshavardhan Manjunatha
Dear Prof. Mattmann, I would love to collaborate on this & am interested in developing Sentiment Analysis Tika Parsers leveraging Apache OpenNLP. I have completed an Applied NLP course @ USC. I have done a Literature Review of Papers & Open Source Tools on the same recently. Regards, Harsha

Re: Change to NER ParserTest re https://builds.apache.org/job/tika-2.x/57

2016-03-25 Thread Mattmann, Chris A (3980)
Hey Tim, I’ll take a look. Would be good to add the @AfterClass for sure though. Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory

[jira] [Commented] (TIKA-774) ExifTool Parser

2016-03-25 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212422#comment-15212422 ] Chris A. Mattmann commented on TIKA-774: Hey [~rgauss] great work. Let's keep them co-existing for

Re: GSOC2016 Sentiment Analysis

2016-03-25 Thread Mattmann, Chris A (3980)
Hi Madhawa, So, how about a project that develops and contributes an Apache Tika and OpenNLP based SentimentAnalysisParser? I have some students currently doing work using the Fisher Callhome Corpus and you can build off that. I am CC’ing my USC IRDS team and my student Indhu who is working on

[jira] [Commented] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2016-03-25 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212052#comment-15212052 ] John Hewson commented on TIKA-1285: --- It would be better to open JIRA issues for problem PDFs so that we

[jira] [Comment Edited] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2016-03-25 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212049#comment-15212049 ] John Hewson edited comment on TIKA-1285 at 3/25/16 4:42 PM: The parser and the

[jira] [Commented] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2016-03-25 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212049#comment-15212049 ] John Hewson commented on TIKA-1285: --- The parser and the rest of PDFBox are tightly coupled, so it's not

[jira] [Updated] (TIKA-1908) --list-met-models does not display Dublin core along with other metadata models

2016-03-25 Thread Sharmilee S (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sharmilee S updated TIKA-1908: -- Summary: --list-met-models does not display Dublin core along with other metadata models (was: