[ 
https://issues.apache.org/jira/browse/MAHOUT-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836246#action_12836246
 ] 

Ted Dunning commented on MAHOUT-299:
------------------------------------

{quote}
Just wanted to check on this. I think the pattern below is the right one to 
use to catch exceptions in ObjectIntProcedures (from OutputCollector) in a map 
or reduce phase. Does this look good to everyone else? (Not sure how the 
discussion on the list about this ended up.)
{noformat}
code that nests the exception in an ISE (IllegalStateException), then rethrows the cause
{noformat}
{quote}
Yes.  That looks about right.

(visions of small proteins being helped across a blood brain barrier)
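
For readers without the patch at hand, here is a minimal sketch of the pattern quoted above, not the code from MAHOUT-299.patch. It assumes the checked exception is an IOException thrown by the collector. The names RethrowCauseSketch, Collector, forEachPair and emitAll are hypothetical stand-ins, and ObjectIntProcedure is redeclared locally so the example compiles without Mahout or Hadoop on the classpath.

```java
import java.io.IOException;
import java.util.List;

final class RethrowCauseSketch {

  // Hypothetical stand-in for a callback interface whose method
  // cannot declare checked exceptions.
  interface ObjectIntProcedure<T> {
    boolean apply(T key, int value);
  }

  // Hypothetical stand-in for an output collector that may throw IOException.
  interface Collector<K> {
    void collect(K key, int value) throws IOException;
  }

  // Hypothetical iteration helper: calls the procedure with each key and its
  // index, stopping early if the procedure returns false.
  static <K> void forEachPair(List<K> keys, ObjectIntProcedure<K> proc) {
    for (int i = 0; i < keys.size(); i++) {
      if (!proc.apply(keys.get(i), i)) {
        break;
      }
    }
  }

  // Emits every pair. An IOException raised inside the callback is nested in
  // an IllegalStateException to get it out of the callback, then unwrapped
  // and rethrown so the caller sees the original exception.
  static <K> void emitAll(List<K> keys, final Collector<K> out) throws IOException {
    try {
      forEachPair(keys, (key, value) -> {
        try {
          out.collect(key, value);
          return true;
        } catch (IOException e) {
          // The callback signature has no throws clause, so wrap.
          throw new IllegalStateException(e);
        }
      });
    } catch (IllegalStateException ise) {
      if (ise.getCause() instanceof IOException) {
        throw (IOException) ise.getCause();
      }
      // Not one of ours: let it propagate unchanged.
      throw ise;
    }
  }
}
```

One caveat with this sketch: any IllegalStateException that happens to carry an IOException cause is unwrapped, not only the ones created in the callback. A dedicated private wrapper exception would avoid that ambiguity.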

> Collocations: improve performance by making Gram BinaryComparable
> -----------------------------------------------------------------
>
>                 Key: MAHOUT-299
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-299
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Utils
>    Affects Versions: 0.3
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>            Priority: Minor
>             Fix For: 0.3
>
>         Attachments: MAHOUT-299.patch
>
>
> Robin's profiling indicated that a large portion of a run was spent in 
> readFields() in Gram due to the deserialization occurring as a part of Gram 
> comparisons for sorting. He pointed me to BinaryComparable and the 
> implementation in Text.
> Like Text, in this new implementation, Gram stores its string in binary form. 
> When encoding the string at construction time we allocate an extra 
> character's worth of data to hold the Gram type information. When sorting 
> Grams, the binary arrays are compared instead of deserializing and comparing 
> fields.
>  
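
The idea in the description above can be sketched roughly as follows. This is an illustration only, not the Gram class from the patch: BinaryGramSketch, its Type enum and compareBytes are hypothetical names, and placing the type byte at the front of the array is an assumption (the description only says that extra space is allocated for it). The comparison treats bytes as unsigned, as Hadoop's WritableComparator.compareBytes does.

```java
import java.nio.charset.StandardCharsets;

final class BinaryGramSketch implements Comparable<BinaryGramSketch> {

  // Hypothetical type tags; the real Gram types may differ.
  enum Type { HEAD, TAIL, UNIGRAM }

  // Layout assumed here: [type byte][UTF-8 bytes of the string].
  private final byte[] bytes;

  BinaryGramSketch(String text, Type type) {
    byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
    // Allocate one extra byte to hold the type information.
    bytes = new byte[encoded.length + 1];
    bytes[0] = (byte) type.ordinal();
    System.arraycopy(encoded, 0, bytes, 1, encoded.length);
  }

  // Decoding happens only when the string is actually needed.
  String getString() {
    return new String(bytes, 1, bytes.length - 1, StandardCharsets.UTF_8);
  }

  Type getType() {
    return Type.values()[bytes[0]];
  }

  // Sorting compares the raw arrays; no fields are deserialized.
  @Override
  public int compareTo(BinaryGramSketch other) {
    return compareBytes(bytes, other.bytes);
  }

  // Lexicographic comparison of two byte arrays, treating bytes as unsigned.
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int x = a[i] & 0xff;
      int y = b[i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    return a.length - b.length;
  }
}
```

With this layout, grams sort first by type and then by the byte order of their UTF-8 encoding, which is not necessarily the same order as String.compareTo.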

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
