Re: [GitHub] opennlp pull request #238: Revert merging of sentiment work, no consent to m...

Joern Kottmann Thu, 29 Jun 2017 07:37:27 -0700

Which data sets did you use to evaluate this?
I was looking for a bit more than a sample file to train it.


I noticed that you checked in stanford and netflix models.

The stanford data set is probably this one:
http://ai.stanford.edu/~amaas/data/sentiment/

Do you have a link for the netflix data?

Jörn

On Thu, Jun 29, 2017 at 4:00 PM, Chris Mattmann <[email protected]> wrote:
> Absolutely you can find it here:
>
> opennlp-tools/src/test/resources/opennlp/tools/sentiment/sample_train_categ 
> (for categorical /multi-class)
> opennlp-tools/src/test/resources/opennlp/tools/sentiment/sample_train_categ2 
> (for categorical/multi-class)
>
> We can also do similar files where instead of multi-class, we just use 
> pos/neg as the label.
>
> Cheers,
> Chris
>
>
>
>
>
> On 6/29/17, 2:35 AM, "Joern Kottmann" <[email protected]> wrote:
>
>     Hello Chris,
>
>     could you please point me to files I can use to train the sentiment
>     component? I am currently looking again through the code and would
>     like to train it myself.
>
>     Jörn
>
>     On Tue, Jun 27, 2017 at 4:59 PM, Dan Russ <[email protected]> wrote:
>     > Hi All,
>     >    First, let me take a share of blame for the comment Chris mentioned. 
>  I believe I said something like the pull request was X revision behind and Y 
> revisions ahead.  It was not meant to be rude, it was meant to say it is hard 
> to review code when it is so different from the current code base. I am very 
> excited that sentiment analysis is going to be added to OpenNLP, but I have 
> not had time to play with it. If I were to say “great job” before I have add 
> a chance to look at it, it would be flattery not honest praise.
>     >
>     >   Let’s clean up the merge.  I agree with Chris that scalability and 
> perfection should not be our initial goal.  Let’s get something, and we can 
> decide how to optimize later (even if it require a complete rewrite).  
> Perfection is the enemy of the good.
>     >
>     >   Finally, because of Chris’ comments it is hard to thank Ana and Chris 
> without sounding insincere.  But I’ll try, thank you Chris and Ana.  I hope 
> we can get beyond this and that Chris and Ana will continue to improve the 
> performance of the sentiment analysis tool and happily remain part of the 
> OpenNLP family.  It is also a good time to toss a big thank you to all of the 
> committers, users, and PMC member.  I use OpenNLP almost everyday.  Your work 
> is extremely valuable to me.
>     >
>     > Thank you,
>     > Daniel
>     >
>     >> On Jun 27, 2017, at 10:25 AM, Chris Mattmann <[email protected]> 
> wrote:
>     >>
>     >> Hi everyone,
>     >>
>     >> I spoke with Joern in Slack. Some of his concerns are:
>     >>
>     >> 1. This was done with a Merge commit and apparently they squash and 
> rebase.
>     >> [would be helpful to see some pointer on this for documentation, thus 
> far I
>     >> haven’t found any]
>     >> 2. Apparently we literally need to ask others for +1 votes and record 
> them
>     >> before committing? I thought since Ana and I are committers aren were 
> +1,
>     >> and since Joern had been providing feedback (the last of which was to 
> add
>     >> tests, which we did) that he would be +1 as well (I guess he is not, 
> and I guess
>     >> formally we need to do a +1 vote even still)
>     >> 3. There was concern about scalability of the code.
>     >> 4. There are thoughts that the code was not perfect yet (even though 
> it works
>     >> fine in the MEMEX project for Ana and I)
>     >>
>     >> So, Joern has opened up a revert PR.
>     >>
>     >> I suppose I should state I find this process extremely heavyweight and 
> unwelcoming.
>     >> To me, there should be a modicum of trust for committers, but I feel 
> like even as a
>     >> committer, I am operating as a “contributor” to the project. Committer 
> means that
>     >> there is trust to modify the source code base. Of the issues above, 
> the only one I see
>     >> as a moderate snafu was #1, and frankly if there are some instructions 
> that show me
>     >> how to do squashing and rebasing *first* I will try to do that in the 
> future since I am
>     >> not a GIt expert.
>     >>
>     >> That said, I must state I feel pretty put off by Apache OpenNLP. This 
> originated as a GSoC
>     >> effort, and we have worked pretty consistently on this over the last 
> year. We used a
>     >> separate GitHub project to get started, kept Joern involved as another 
> mentor, even
>     >> provided access and commit writes to that GitHub repository for a long 
> time, so this
>     >> code was developed in the open. Joern even created a branch in 
> ApacheOpenNLP in the code and I suppose
>     >> I should have gone and worked on that branch first since master is 
> apparently so
>     >> pristine that even an Apache veteran like me can’t get something in to 
> it without
>     >> making a whole bunch of (what are IMO minor issues, and what are IMO 
> heavyweight
>     >> “community” issues).
>     >>
>     >> I am concerned from a community point of view that the first comment 
> wasn’t “Great
>     >> job Chris, you got Sentiment Analysis into Apache, *but* I have these 
> concerns 1-4 above”.
>     >> It was “The PR was merged wrong in ways 1-4 and I’m going to revert 
> it.”
>     >>
>     >> That’s pretty off-putting to someone who is semi-new like me and like 
> Ana.
>     >>
>     >> Anyways, go ahead and revert it. Sorry to have caused any issues.
>     >>
>     >> Chris
>     >>
>     >>
>     >>
>     >> On 6/27/17, 7:06 AM, "Chris Mattmann" <[email protected]> wrote:
>     >>
>     >>    Hi Joern,
>     >>
>     >>    I’m confused. Why did you revert my commit?
>     >>
>     >>    Every one of those check points you put on the PR was checked?
>     >>    We have been discussing this for months, you have seen the
>     >>    code for months, Ana and I have worked diligently on the code
>     >>    in plain view of everyone.
>     >>
>     >>    Please explain.
>     >>
>     >>    Chris
>     >>
>     >>
>     >>
>     >>
>     >>    On 6/27/17, 1:23 AM, "kottmann" <[email protected]> wrote:
>     >>
>     >>        GitHub user kottmann opened a pull request:
>     >>
>     >>            https://github.com/apache/opennlp/pull/238
>     >>
>     >>            Revert merging of sentiment work, no consent to merge it
>     >>
>     >>            Thank you for contributing to Apache OpenNLP.
>     >>
>     >>            In order to streamline the review of the contribution we 
> ask you
>     >>            to ensure the following steps have been taken:
>     >>
>     >>            ### For all changes:
>     >>            - [ ] Is there a JIRA ticket associated with this PR? Is it 
> referenced
>     >>                 in the commit message?
>     >>
>     >>            - [ ] Does your PR title start with OPENNLP-XXXX where XXXX 
> is the JIRA number you are trying to resolve? Pay particular attention to the 
> hyphen "-" character.
>     >>
>     >>            - [ ] Has your PR been rebased against the latest commit 
> within the target branch (typically master)?
>     >>
>     >>            - [ ] Is your initial contribution a single, squashed 
> commit?
>     >>
>     >>            ### For code changes:
>     >>            - [ ] Have you ensured that the full suite of tests is 
> executed via mvn clean install at the root opennlp folder?
>     >>            - [ ] Have you written or updated unit tests to verify your 
> changes?
>     >>            - [ ] If adding new dependencies to the code, are these 
> dependencies licensed in a way that is compatible for inclusion under [ASF 
> 2.0](http://www.apache.org/legal/resolved.html#category-a)?
>     >>            - [ ] If applicable, have you updated the LICENSE file, 
> including the main LICENSE file in opennlp folder?
>     >>            - [ ] If applicable, have you updated the NOTICE file, 
> including the main NOTICE file found in opennlp folder?
>     >>
>     >>            ### For documentation related changes:
>     >>            - [ ] Have you ensured that format looks appropriate for 
> the output in which it is rendered?
>     >>
>     >>            ### Note:
>     >>            Please ensure that once the PR is submitted, you check 
> travis-ci for build issues and submit an update to your PR as soon as 
> possible.
>     >>
>     >>
>     >>        You can merge this pull request into a Git repository by 
> running:
>     >>
>     >>            $ git pull https://github.com/kottmann/opennlp 
> revert_sentiment
>     >>
>     >>        Alternatively you can review and apply these changes as the 
> patch at:
>     >>
>     >>            https://github.com/apache/opennlp/pull/238.patch
>     >>
>     >>        To close this pull request, make a commit to your master/trunk 
> branch
>     >>        with (at least) the following in the commit message:
>     >>
>     >>            This closes #238
>     >>
>     >>        ----
>     >>        commit 123222eb34724bae793e9d6d22e202c0aee0aa45
>     >>        Author: JÃ¶rn Kottmann <[email protected]>
>     >>        Date:   2017-06-27T08:19:19Z
>     >>
>     >>            Revert merging of sentiment work, no consent to merge it
>     >>
>     >>        ----
>     >>
>     >>
>     >>        ---
>     >>        If your project is set up for it, you can reply to this email 
> and have your
>     >>        reply appear on GitHub as well. If your project does not have 
> this feature
>     >>        enabled and wishes so, or if the feature is enabled but not 
> working, please
>     >>        contact infrastructure at [email protected] or file a 
> JIRA ticket
>     >>        with INFRA.
>     >>        ---
>     >>
>     >>
>     >>
>     >>
>     >>
>     >>
>     >
>
>
>

Re: [GitHub] opennlp pull request #238: Revert merging of sentiment work, no consent to m...

Reply via email to