Sorry, cocoon vs. commons.

On Wed, Jan 4, 2012 at 2:24 PM, Lance Norskog <[email protected]> wrote:
> I have a separate solution: strip the quoted text. Quoted text in the
> emails spams the term vectors; just plain TF-IDF is not enough to
> combat this. Lucene has a lot of tools besides TF-IDF.
>
> I have a patch, gotta start the JIRA. Also added more measurements to
> the confusion matrix. I want to get a good measurement of the
> performance on each producer and consumer, not just a global ratio.
> 'testnb' gives 80% but one of the false boxes has a 1. This is bogus.
> (I'm using your complete corpus of commons vs. cocoon, classifying
> dev vs. user.)
>
> On Wed, Jan 4, 2012 at 6:57 AM, Grant Ingersoll (Updated) (JIRA)
> <[email protected]> wrote:
>>
>>     [ https://issues.apache.org/jira/browse/MAHOUT-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Grant Ingersoll updated MAHOUT-939:
>> -----------------------------------
>>
>>    Attachment: MAHOUT-939.patch
>>
>> Here's a start on this.  Added some more construction options to the 
>> AdaptiveLogisticRegression class.  Still testing what values to use in 
>> TrainASFEmail, but thought I would put this up for now.
>>
>>> ASF Email SGD Examples don't produce good results
>>> -------------------------------------------------
>>>
>>>                 Key: MAHOUT-939
>>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-939
>>>             Project: Mahout
>>>          Issue Type: Bug
>>>    Affects Versions: 0.6
>>>            Reporter: Grant Ingersoll
>>>            Assignee: Grant Ingersoll
>>>              Labels: MAHOUT_INTRO_CONTRIBUTE
>>>             Fix For: 0.7
>>>
>>>         Attachments: MAHOUT-939.patch
>>>
>>>
>>> The SGD examples for the ASF email don't work all that well currently in 
>>> terms of quality.  Also, need to determine how much memory is required for 
>>> vectors of cardinality size 100K.
>>
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA 
>> administrators: 
>> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>
>>
>
>
>
> --
> Lance Norskog
> [email protected]



-- 
Lance Norskog
[email protected]
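
The quoted-text stripping mentioned in the message above could look roughly like the sketch below. It is not the patch referred to in the thread. It is a minimal Python illustration that assumes quoted lines start with '>' and that an attribution line has the single-line form "On <date>, <name> wrote:". The names strip_quoted, QUOTE_RE and ATTRIBUTION_RE are hypothetical, and real mail would also need handling for attributions wrapped over two lines, signatures and other quoting styles.

```python
import re

# Assumption: a quoted line begins with '>' (optionally after whitespace).
QUOTE_RE = re.compile(r"^\s*>")
# Assumption: an attribution line fits on one line, e.g. "On <date>, <name> wrote:".
ATTRIBUTION_RE = re.compile(r"^On .+ wrote:\s*$")


def strip_quoted(body: str) -> str:
    """Return the message body without quoted lines and attribution lines."""
    kept = []
    for line in body.splitlines():
        if QUOTE_RE.match(line) or ATTRIBUTION_RE.match(line):
            continue  # drop text repeated from earlier messages
        kept.append(line)
    return "\n".join(kept).strip()


# Made-up sample message, for illustration only.
sample = (
    "Sorry, cocoon vs. commons.\n"
    "\n"
    "On Wed, Jan 4, 2012 at 2:24 PM, Someone wrote:\n"
    "> I have a separate solution: strip the quoted text.\n"
    ">> Older quoted text.\n"
)

print(strip_quoted(sample))
```

Running a filter like this before tokenizing would keep only the text the sender actually wrote, so terms repeated in every reply of a thread are counted once rather than once per reply.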

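
The point in the quoted message about measuring more than a global ratio can be illustrated with per-class figures from a confusion matrix. The sketch below is not Mahout code. It is plain Python with hypothetical function names, and it reads "producer" and "consumer" as producer's accuracy (recall) and consumer's accuracy (precision), which is an assumption about the intended meaning. The counts are invented for illustration and are not results from the thread.

```python
def per_class_metrics(matrix, labels):
    """matrix[i][j] = number of items with true label i predicted as label j."""
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = matrix[k][k]
        actual = sum(matrix[k])                   # row total: items truly in class k
        predicted = sum(row[k] for row in matrix) # column total: items predicted as k
        metrics[label] = {
            # consumer's accuracy: of the items predicted as k, how many are right
            "precision": true_positive / predicted if predicted else 0.0,
            # producer's accuracy: of the items truly k, how many were found
            "recall": true_positive / actual if actual else 0.0,
        }
    return metrics


def accuracy(matrix):
    """Global ratio of correctly classified items."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


# Invented counts: every item is predicted as "dev".
labels = ["dev", "user"]
matrix = [
    [80, 0],  # true dev:  80 predicted dev, 0 predicted user
    [20, 0],  # true user: 20 predicted dev, 0 predicted user
]

print(accuracy(matrix))
print(per_class_metrics(matrix, labels))
```

With these invented counts the global accuracy is 0.8 while recall for "user" is 0.0, which is the kind of failure a single overall percentage hides.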