[ https://issues.apache.org/jira/browse/MAHOUT-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836246#action_12836246 ]
Ted Dunning commented on MAHOUT-299:
------------------------------------

{quote}
Just wanted to check on this - I think the pattern below is the right one to use to catch exceptions in ObjectIntProcedures (from OutputCollector) in a map or reduce phase -- does this look good to everyone else? (Not sure how the discussion on the list re: this ended up)
{noformat}
code that nests in a ISE, then rethrows the cause
{noformat}
{quote}

Yes.  That looks about right.

(visions of small proteins being helped across a blood brain barrier)

> Collocations: improve performance by making Gram BinaryComparable
> -----------------------------------------------------------------
>
>                 Key: MAHOUT-299
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-299
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Utils
>    Affects Versions: 0.3
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>            Priority: Minor
>             Fix For: 0.3
>
>         Attachments: MAHOUT-299.patch
>
>
> Robin's profiling indicated that a large portion of a run was spent in
> readFields() in Gram due to the deserialization occurring as a part of Gram
> comparisons for sorting. He pointed me to BinaryComparable and the
> implementation in Text.
> Like Text, in this new implementation, Gram stores its string in binary form.
> When encoding the string at construction time we allocate an extra
> character's worth of data to hold the Gram type information. When sorting
> Grams, the binary arrays are compared instead of deserializing and comparing
> fields.
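
For readers following the issue description, here is a rough sketch of the idea of keeping a Gram in binary form and comparing raw bytes. It is not the code in MAHOUT-299.patch. The class name BinaryGram, the Type values, and the choice to put the type tag in the first byte are all assumptions made for illustration; the description only says that one extra character's worth of data holds the type. It also avoids Hadoop so that it compiles on its own, whereas the real class would extend BinaryComparable.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for Gram; illustrative only, not the Mahout class. */
final class BinaryGram implements Comparable<BinaryGram> {

  /** Hypothetical type tags; the real Gram types may differ. */
  enum Type { HEAD, TAIL, UNIGRAM }

  /** Layout assumed here: byte 0 is the type tag, the rest is the UTF-8 string. */
  private final byte[] bytes;

  BinaryGram(String text, Type type) {
    byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
    // Allocate one extra byte to hold the type information.
    bytes = new byte[encoded.length + 1];
    bytes[0] = (byte) type.ordinal();
    System.arraycopy(encoded, 0, bytes, 1, encoded.length);
  }

  Type getType() {
    return Type.values()[bytes[0]];
  }

  String getString() {
    // Decoding happens only when the string is actually needed.
    return new String(bytes, 1, bytes.length - 1, StandardCharsets.UTF_8);
  }

  @Override
  public int compareTo(BinaryGram other) {
    // Lexicographic comparison of unsigned bytes; no fields are deserialized.
    int n = Math.min(bytes.length, other.bytes.length);
    for (int i = 0; i < n; i++) {
      int a = bytes[i] & 0xff;
      int b = other.bytes[i] & 0xff;
      if (a != b) {
        return a - b;
      }
    }
    return bytes.length - other.bytes.length;
  }
}
```

With this layout, grams sort by type first and then by the bytes of the string, so the sort order depends on where the type byte is placed. That placement is a design choice this sketch makes, not something the issue states.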