Re: Tika 2.0 - Replace POI IOUtils with commons-io IOUtils

2016-03-27 Thread Bob Paulin
Tika's IOUtils appears to be missing the readFully method. Should that be added? - Bob On 3/27/2016 6:52 PM, Nick Burch wrote: On Sun, 27 Mar 2016, Bob Paulin wrote: Currently the Apache POI dependency is in several modules and it's sort of a beast (> 2 MB in size). You should've seen it b
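For readers unfamiliar with what a readFully helper does, here is a minimal sketch using only java.io. It is not the POI, commons-io, or Tika implementation; the class name ReadFullySketch and the choice to return the byte count (rather than throw on a short read) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class for illustration only; not POI, commons-io, or Tika code.
final class ReadFullySketch {
    private ReadFullySketch() {}

    /**
     * Keeps reading until the buffer is full or the stream ends.
     * Returns the number of bytes actually read (0 if the stream was already at EOF).
     */
    static int readFully(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            // A single read() may return fewer bytes than requested, hence the loop.
            int n = in.read(buffer, total, buffer.length - total);
            if (n < 0) {
                break; // end of stream reached before the buffer was filled
            }
            total += n;
        }
        return total;
    }
}
```

The loop is the whole point: a single InputStream.read call is allowed to return fewer bytes than asked for, so a caller that needs a complete header or record has to keep reading until the buffer is full or the stream ends.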

Re: Tika 2.0 - Replace POI IOUtils with commons-io IOUtils

2016-03-27 Thread Bob Paulin
Hi Nick, On 3/27/2016 6:52 PM, Nick Burch wrote: On Sun, 27 Mar 2016, Bob Paulin wrote: Currently the Apache POI dependency is in several modules and it's sort of a beast (> 2 MB in size). You should've seen it before Jukka and Yegor spent a crazy ApacheCon hacking up the ooxml-lite support.

Re: Tika 2.0 - Replace POI IOUtils with commons-io IOUtils

2016-03-27 Thread Nick Burch
On Sun, 27 Mar 2016, Bob Paulin wrote: Currently the Apache POI dependency is in several modules and it's sort of a beast (> 2 MB in size). You should've seen it before Jukka and Yegor spent a crazy ApacheCon hacking up the ooxml-lite support... ;-) It appears many of the modules are only u

Re: GSOC2016 Sentiment Analysis

2016-03-27 Thread Mattmann, Chris A (3980)
Nishant, I’m not sure what you are talking about, at all. It’s part of the engagement process in GSoC to *engage the community*. At Apache this is done on list. I’ve been on this list for months and there is about zero traffic. Which is not good. Traffic, like this, *is good*. It shows there is a h

Re: GSOC2016 Sentiment Analysis

2016-03-27 Thread Nishant Kelkar
Hi Madhawa, Could you take this discussion off the dev openNLP list for other problems concerning logging in, participation, etc. now that you have a positive response? In my humble opinion, that would prevent others not involved in your discussion from getting email about the topic. Good luck!

Re: Tika 2.0 - Replace POI IOUtils with commons-io IOUtils

2016-03-27 Thread Bob Paulin
There is also org.apache.poi.util.StringUtil (in cad module) and org.apache.poi.util.LittleEndian (in code module). Neither of these seems to have a commons library replacement from what I can see. Given the small amount of code in the methods that are actually used, would it make sense to mov
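To give a sense of how small such a utility can be, here is a rough sketch of a little-endian integer read written with plain Java. The class name LittleEndianSketch and the method shown are hypothetical; the message does not say which LittleEndian or StringUtil methods the modules actually use, and this is not POI's code.

```java
// Hypothetical stand-in for illustration only; not copied from POI.
final class LittleEndianSketch {
    private LittleEndianSketch() {}

    /**
     * Decodes a 32-bit signed integer stored least-significant byte first,
     * starting at the given offset in the array.
     */
    static int getInt(byte[] data, int offset) {
        // Mask each byte with 0xFF so sign extension does not corrupt the result.
        return (data[offset] & 0xFF)
                | ((data[offset + 1] & 0xFF) << 8)
                | ((data[offset + 2] & 0xFF) << 16)
                | ((data[offset + 3] & 0xFF) << 24);
    }
}
```

A helper of this size could plausibly be copied into a module rather than pulled in through a large dependency, which is the trade-off the message raises.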

Tika 2.0 - Replace POI IOUtils with commons-io IOUtils

2016-03-27 Thread Bob Paulin
Hi, Currently the Apache POI dependency is in several modules and it's sort of a beast (> 2 MB in size). It appears many of the modules are only using the IOUtils library. The big exception is the office module which is responsible for parsing documents. These methods appear to also exist

Re: GSOC2016 Sentiment Analysis

2016-03-27 Thread Mattmann, Chris A (3980)
Thanks, please can you create a username with no spaces? Sent from my iPhone On Mar 27, 2016, at 2:20 AM, Madhawa Kasun Gunasekara <madhaw...@gmail.com> wrote: Hi Chris, Thanks for the reply. I tried to log in to [1], but I wasn't able to log in. My username is "Madhawa Gunas

[jira] [Updated] (TIKA-1911) Sentiment Analysis

2016-03-27 Thread Madhawa Gunasekara (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Madhawa Gunasekara updated TIKA-1911: Labels: GSoC2016 irds memex (was: GSoC2016) > Sentiment Analysis

[jira] [Updated] (TIKA-1911) OpenNLP based SentimentAnalysisParser

2016-03-27 Thread Madhawa Gunasekara (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Madhawa Gunasekara updated TIKA-1911: Summary: OpenNLP based SentimentAnalysisParser (was: Sentiment Analysis) > OpenNLP based S

[jira] [Created] (TIKA-1911) Sentiment Analysis

2016-03-27 Thread Madhawa Gunasekara (JIRA)
Madhawa Gunasekara created TIKA-1911: Summary: Sentiment Analysis Key: TIKA-1911 URL: https://issues.apache.org/jira/browse/TIKA-1911 Project: Tika Issue Type: Improvement Rep