[ https://issues.apache.org/jira/browse/MAHOUT-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robin Anil updated MAHOUT-1236:
-------------------------------
    Fix Version/s: 1.0

+1 to protobuf tracking for the 1.0 release

> Need a cleaned up serialized format for Vectors to handle names and all other
> kinds of things
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1236
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1236
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Ted Dunning
>             Fix For: 1.0
>
>
> Our current serialization is subject to several ills:
> a) it breaks alignment by having a 1-byte flag field (evil, generic)
> b) it doesn't handle any kind of extensible format like protobufs, so it isn't
> future-proof
> c) it doesn't handle named vectors very well
> d) it totally breaks with any other kind of decoration, as with Centroids or
> WeightedVector or ... (see b)
> I propose that we keep the current tag byte of the current serialization and
> add a new flag bit that indicates that the vector uses a protobuf encoding.
> Then 3 bytes will be skipped to restore alignment. Then there will be a
> protobuf encoding for the vector.
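
The framing proposed in the issue (tag byte with a new flag bit, 3 skipped bytes, then the protobuf body) could look roughly like the sketch below. This is only an illustration under stated assumptions, not Mahout code: the class name ProtobufVectorFraming, the choice of 0x80 as the flag bit, and the zero-filled padding are all hypothetical, and the protobuf body is treated as opaque bytes because the issue does not define a message schema.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper; not part of Mahout.
final class ProtobufVectorFraming {
  // Assumed flag bit. The issue does not say which bit of the tag byte is used.
  static final int PROTOBUF_FLAG = 0x80;
  // Bytes skipped after the 1-byte tag so the body starts at offset 4.
  static final int PADDING = 3;

  // Prefix an already-encoded protobuf body with the tag byte and padding.
  static byte[] frame(int existingFlags, byte[] protobufBody) {
    ByteBuffer buf = ByteBuffer.allocate(1 + PADDING + protobufBody.length);
    buf.put((byte) (existingFlags | PROTOBUF_FLAG)); // keep old flags, set new bit
    buf.put(new byte[PADDING]);                      // zero-filled alignment bytes
    buf.put(protobufBody);
    return buf.array();
  }

  // True if the tag byte marks the record as protobuf-encoded.
  static boolean usesProtobuf(byte[] framed) {
    return (framed[0] & PROTOBUF_FLAG) != 0;
  }

  // Return the protobuf body, skipping the tag byte and the padding.
  static byte[] body(byte[] framed) {
    if (!usesProtobuf(framed)) {
      throw new IllegalArgumentException("record uses the legacy encoding");
    }
    return Arrays.copyOfRange(framed, 1 + PADDING, framed.length);
  }

  public static void main(String[] args) {
    byte[] encoded = {10, 20, 30}; // stand-in for a real protobuf message
    byte[] framed = frame(0x01, encoded);
    System.out.println(Arrays.equals(encoded, body(framed)));
  }
}
```

A reader that sees the flag bit unset would fall back to the existing decoder, which is what keeps old data readable under this scheme.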