[ https://issues.apache.org/jira/browse/MAHOUT-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845434#action_12845434 ]
Jake Mannix commented on MAHOUT-337:
------------------------------------

So a question about this: do we really want to do this? The original discussion on the mailing list was about being able to debug by string comparison of the JSON form, effectively, and in general we should be encouraging users to do things like "if (myVector.equivalent(otherVector))" as the way of checking this.

The reason I ask is that in certain M/R jobs, if lengthSquared is ever calculated and the vector is then used immutably from then on, letting the serialized form keep lengthSquared means the value is carried along and not recomputed in subsequent Reduce or further M/R steps after it is computed in some initial Map.

Is there another way to "fix" the issue so that two vectors, one with the value cached and the other without, but otherwise "equal", are properly compared (either by eye in JSON form, or otherwise)? Maybe documentation can take care of this?

> Don't serialize cached length squared in JSON vector representation
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-337
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-337
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.3
>            Reporter: Sean Owen
>            Assignee: Sean Owen
>            Priority: Minor
>             Fix For: 0.4
>
>
> The cached length-squared field in vectors should be marked transient so that
> it is not part of the JSON serialized state.
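
To make the trade-off in the comment concrete, here is a minimal sketch of both sides. It is an illustration under stated assumptions, not Mahout's code: SimpleVector and roundTrip are hypothetical names, and Java's built-in object serialization stands in for the JSON serializer. The sketch shows that a content-based comparison can ignore the cache entirely, and that a transient cache is dropped on serialization, so it would have to be recomputed in a later step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class CachedLengthDemo {

    /** Hypothetical stand-in for a vector; this is NOT Mahout's Vector API. */
    static class SimpleVector implements Serializable {
        private static final long serialVersionUID = 1L;

        private final double[] values;

        // transient: both cache fields are excluded from the serialized state.
        // After deserialization they take default values (0.0 and false).
        private transient double lengthSquared;
        private transient boolean lengthSquaredCached;

        SimpleVector(double... values) {
            this.values = values.clone();
        }

        /** Computes the squared length on first use and caches it. */
        double getLengthSquared() {
            if (!lengthSquaredCached) {
                double sum = 0.0;
                for (double v : values) {
                    sum += v * v;
                }
                lengthSquared = sum;
                lengthSquaredCached = true;
            }
            return lengthSquared;
        }

        boolean isLengthSquaredCached() {
            return lengthSquaredCached;
        }

        /** Content-based comparison: looks only at the values, never at the cache. */
        boolean equivalent(SimpleVector other) {
            return Arrays.equals(values, other.values);
        }
    }

    /** Serializes and deserializes a vector in memory (assumed helper for the demo). */
    static SimpleVector roundTrip(SimpleVector v) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(v);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SimpleVector) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SimpleVector a = new SimpleVector(3.0, 4.0);
        SimpleVector b = new SimpleVector(3.0, 4.0);
        a.getLengthSquared(); // populates the cache on a only

        // The comparison ignores the cache, so a and b are still equivalent.
        System.out.println(a.equivalent(b)); // prints true

        // The cache does not survive serialization and must be recomputed.
        System.out.println(roundTrip(a).isLengthSquaredCached()); // prints false
    }
}
```

The second result is the cost raised in the comment: with a transient field, a value computed in an initial Map step would be recomputed wherever it is next needed. Whether that cost outweighs the cleaner serialized form depends on the job.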