[ https://issues.apache.org/jira/browse/SINGA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354718#comment-15354718 ]
ASF subversion and git services commented on SINGA-210:
-------------------------------------------------------

Commit 62c6603ff7a3fe9f9749021e84ad9ec35f3fef7d in incubator-singa's branch refs/heads/dev from WANG Ji
[ https://git-wip-us.apache.org/repos/asf?p=incubator-singa.git;h=62c6603 ]

SINGA-210 Enable checkpoint and resume for v1.0

This ticket is going to add code for dumping the model parameters as checkpoint files, which could be used for fine-tuning and deployment.

Serialize Tensor into TensorProto and save it in BinFile, which is stored as <prefix>.model, and generate a description of the parameters in <prefix>.desc.
Unit test cases passed for the kFloat, kInt and kDouble data types.


> Enable checkpoint and resume for v1.0
> -------------------------------------
>
>                 Key: SINGA-210
>                 URL: https://issues.apache.org/jira/browse/SINGA-210
>             Project: Singa
>          Issue Type: New Feature
>            Reporter: wangwei
>
> This ticket is going to add code for dumping the model parameters as
> checkpoint files, which could be used for fine-tuning and deployment.
> The model parameters should be separated from the model definition, i.e., net
> construction. Users either randomly initialize the layer parameters or use
> the parameters from checkpoint files after creating the neural net. In other
> words, we do not add a pair of serializing and parsing functions in the Layer
> class.
> We need to decide the format of the checkpoint file and how to write and read
> it:
> 1. the checkpoint file consists of the model parameters, which could be
> serialized as key-value pairs, where the key is the parameter name and the value
> is a protobuf object including the shape and values. Optionally, there could
> be a text file including the parameter meta info, e.g., name and shape, which
> would be useful for users to know the model parameters without parsing the
> binary checkpoint file.
> 2. the binary checkpoint file can be serialized using the Writer (SINGA-202)
> and loaded into memory using the Reader (SINGA-202).
> 3. A checkpoint utility class should be implemented for 1 and 2.
> Compatibility with Caffe checkpoint files may also be considered to re-use
> models from the Caffe model zoo http://caffe.berkeleyvision.org/model_zoo.html.
> {code}
> class Checkpoint {
>   // <prefix>.model is the binary file for parameter key-value pairs;
>   // <prefix>.meta is the text file, one line per parameter.
>   Checkpoint(prefix, mode=[R|W]);
>   Read();  // read .model
>   ReadMeta();  // read meta only
>   Get(key);  // return the value protobuf obj.
>   GetMeta(key);
>   Read(key);
>   Write(key, value);  // write to both .model and .meta files.
> };
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
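
For illustration only, the Checkpoint interface proposed in the issue description could be sketched in C++ roughly as follows. This is not the code from commit 62c6603. The Param struct and the length-prefixed record layout are hypothetical stand-ins for TensorProto and BinFile, the file names follow the <prefix>.model / <prefix>.meta convention from the issue text, values are assumed to be float only, and parameter names are assumed to contain no whitespace.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the protobuf value (shape + values) in the ticket.
struct Param {
  std::vector<uint32_t> shape;
  std::vector<float> values;
};

class Checkpoint {
 public:
  enum Mode { kRead, kWrite };

  Checkpoint(const std::string& prefix, Mode mode) : prefix_(prefix) {
    if (mode != kWrite) return;
    model_.open(prefix_ + ".model", std::ios::binary | std::ios::trunc);
    meta_.open(prefix_ + ".meta", std::ios::trunc);
    if (!model_ || !meta_) throw std::runtime_error("cannot open for writing");
  }

  // Write one binary record to .model and one text line to .meta.
  // Integers are stored in host byte order (assumption: same-machine reuse).
  void Write(const std::string& key, const Param& p) {
    if (!model_.is_open()) throw std::logic_error("not opened in write mode");
    PutU32(static_cast<uint32_t>(key.size()));
    model_.write(key.data(), static_cast<std::streamsize>(key.size()));
    PutU32(static_cast<uint32_t>(p.shape.size()));
    for (uint32_t d : p.shape) PutU32(d);
    PutU32(static_cast<uint32_t>(p.values.size()));
    model_.write(reinterpret_cast<const char*>(p.values.data()),
                 static_cast<std::streamsize>(p.values.size() * sizeof(float)));
    meta_ << key;
    for (uint32_t d : p.shape) meta_ << ' ' << d;
    meta_ << '\n';
  }

  // Read every record of <prefix>.model into memory.
  void Read() {
    std::ifstream in(prefix_ + ".model", std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + prefix_ + ".model");
    uint32_t n = 0;
    while (GetU32(in, &n)) {
      std::string key(n, '\0');
      in.read(&key[0], n);
      Param p;
      GetU32(in, &n);
      p.shape.resize(n);
      for (uint32_t& d : p.shape) GetU32(in, &d);
      GetU32(in, &n);
      p.values.resize(n);
      in.read(reinterpret_cast<char*>(p.values.data()),
              static_cast<std::streamsize>(n * sizeof(float)));
      if (!in) throw std::runtime_error("truncated record in .model");
      params_[key] = p;
    }
  }

  // Read only <prefix>.meta; each line is "<name> <dim0> <dim1> ...".
  void ReadMeta() {
    std::ifstream in(prefix_ + ".meta");
    if (!in) throw std::runtime_error("cannot open " + prefix_ + ".meta");
    std::string line;
    while (std::getline(in, line)) {
      std::istringstream fields(line);
      std::string key;
      fields >> key;
      std::vector<uint32_t> shape;
      for (uint32_t d; fields >> d;) shape.push_back(d);
      meta_info_[key] = shape;
    }
  }

  // Both getters throw std::out_of_range for an unknown key.
  const Param& Get(const std::string& key) const { return params_.at(key); }
  const std::vector<uint32_t>& GetMeta(const std::string& key) const {
    return meta_info_.at(key);
  }

 private:
  void PutU32(uint32_t v) {
    model_.write(reinterpret_cast<const char*>(&v), sizeof v);
  }
  static bool GetU32(std::istream& in, uint32_t* v) {
    return static_cast<bool>(in.read(reinterpret_cast<char*>(v), sizeof *v));
  }

  std::string prefix_;
  std::ofstream model_, meta_;
  std::map<std::string, Param> params_;
  std::map<std::string, std::vector<uint32_t>> meta_info_;
};

// Usage: write two parameters, then read them back through a fresh object.
bool RoundTrip(const std::string& prefix) {
  {
    Checkpoint out(prefix, Checkpoint::kWrite);
    out.Write("conv1_weight", Param{{2, 3}, {1.f, 2.f, 3.f, 4.f, 5.f, 6.f}});
    out.Write("conv1_bias", Param{{2}, {0.5f, -0.5f}});
  }  // both files are flushed and closed when `out` is destroyed
  Checkpoint in(prefix, Checkpoint::kRead);
  in.ReadMeta();
  in.Read();
  return (in.GetMeta("conv1_weight") == std::vector<uint32_t>{2, 3}) &&
         (in.Get("conv1_bias").values == std::vector<float>{0.5f, -0.5f});
}
```

Keeping the shapes in a separate text file means a user can inspect parameter names and shapes with ReadMeta() alone, without parsing the binary file, which is the motivation given in item 1 of the issue description.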