[ 
https://issues.apache.org/jira/browse/MAHOUT-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836207#action_12836207
 ] 

Drew Farris commented on MAHOUT-299:
------------------------------------

Thanks for the review, Sean. I'll get it committed sometime today if I can 
steal some time to do so (sometime this weekend, worst case).

Point taken about the static imports. I prefer the readability, but I probably 
rely on my IDE too much to track down references like that, so I'll remove them 
to conform with the overall style we're following.

Same for RuntimeException; I'll revise that as well. Once those changes are 
complete, I'll commit and close the issue. All in all, it will be a great way 
to test my Karma. 


> Collocations: improve performance by making Gram BinaryComparable
> -----------------------------------------------------------------
>
>                 Key: MAHOUT-299
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-299
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Utils
>    Affects Versions: 0.3
>            Reporter: Drew Farris
>            Priority: Minor
>             Fix For: 0.3
>
>         Attachments: MAHOUT-299.patch
>
>
> Robin's profiling indicated that a large portion of a run was spent in 
> readFields() in Gram due to the deserialization occurring as a part of Gram 
> comparisons for sorting. He pointed me to BinaryComparable and the 
> implementation in Text.
> Like Text, in this new implementation, Gram stores its string in binary form. 
> When encoding the string at construction time we allocate an extra 
> character's worth of data to hold the Gram type information. When sorting 
> Grams, the binary arrays are compared instead of deserializing and comparing 
> fields.
>  
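
The idea in the description above can be sketched roughly as follows. This is
a minimal illustration, not the code in MAHOUT-299.patch: the class name
BinaryGram, the single-character type codes, and the choice to put the type in
the first byte are all assumptions made for the example, and the sketch avoids
any Hadoop dependency so that it runs on its own. It encodes the string once at
construction time, reserves one extra byte for the type, and compares two
instances byte by byte without decoding either one.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for Gram; it only illustrates the byte layout. */
final class BinaryGram implements Comparable<BinaryGram> {

  // Assumed layout: [1 type byte][UTF-8 bytes of the n-gram text].
  private final byte[] bytes;

  /** The type code is assumed to be a single ASCII character, e.g. 'h' or 't'. */
  BinaryGram(String text, char type) {
    byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
    bytes = new byte[encoded.length + 1]; // one extra byte for the type
    bytes[0] = (byte) type;
    System.arraycopy(encoded, 0, bytes, 1, encoded.length);
  }

  /** Decodes the text only when a caller actually asks for it. */
  String text() {
    return new String(bytes, 1, bytes.length - 1, StandardCharsets.UTF_8);
  }

  char type() {
    return (char) (bytes[0] & 0xff);
  }

  /** Orders by raw bytes, so sorting never has to rebuild a String. */
  @Override
  public int compareTo(BinaryGram other) {
    return compareBytes(bytes, other.bytes);
  }

  /** Lexicographic comparison that treats each byte as unsigned. */
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int x = a[i] & 0xff;
      int y = b[i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    return a.length - b.length; // the shorter array sorts first on a tie
  }
}
```

With this layout, grams of the same type sort together and are then ordered by
their encoded text. In a Hadoop job the same comparison would be applied to the
serialized key bytes, which is what avoids the readFields() cost during sorting.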

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
