On 5/5/11 5:57 PM, Chris Collins wrote:
That is a good idea. I would also consider including a few other optional fields and
making it human readable. In the system I work on, all our data gets this type of
"body tag"; we include other things like:
- machine it was built on, and perhaps the OS user that did the run
- build date
- source path of the input data (in this case the training set)
- maybe a hash of the training set
- major/minor version number
- maybe the training tool allows you to pass a set of arbitrary key-value pairs;
this way the above could be defined in an ant script or what have you.
This way, when you find this model sitting on a disk some day, you can actually
figure out if you trust it. Nothing like going into production with something
like this to find it was something built on your intern's laptop just as a test
that everyone forgot about.
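To make the list above concrete, here is a rough sketch of how such a tag could be assembled as plain key-value pairs. Everything in it is an assumption for illustration: the class name BuildTag, the property keys, and the describe helper are hypothetical and not part of any existing training tool.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper; class name and property keys are made up for this sketch.
class BuildTag {

    // Hex-encoded SHA-256 of a file, streamed so large training sets are fine.
    static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                digest.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Host name of the build machine, or "unknown" if it cannot be resolved.
    static String hostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown";
        }
    }

    // Collects the fields from the list above, plus arbitrary extra pairs.
    static Properties describe(Path trainingData, String version, Map<String, String> extra)
            throws IOException, NoSuchAlgorithmException {
        Properties tag = new Properties();
        tag.setProperty("build.host", hostName());
        tag.setProperty("build.user", System.getProperty("user.name", "unknown"));
        tag.setProperty("build.os", System.getProperty("os.name", "unknown"));
        tag.setProperty("build.date", Instant.now().toString());
        tag.setProperty("training.path", trainingData.toAbsolutePath().toString());
        tag.setProperty("training.sha256", sha256(trainingData));
        tag.setProperty("version", version);
        extra.forEach(tag::setProperty); // caller-supplied pairs, e.g. from an ant script
        return tag;
    }
}
```

Keeping the tag as java.util.Properties means it stays human readable when stored, and the extra map covers whatever a build script wants to add.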
That just sounds like what we already write into the model, except for the
machine name, OS and user.
The model itself is a zip package, and includes a manifest which
includes these values.
Maybe we should extend the command line tooling to display it, so that you
do not need to unpack the zip package.
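As a minimal sketch of what such a display command could do, the manifest can be read straight out of the zip with java.util.zip. This assumes the manifest is stored as a Java properties file under the entry name "manifest.properties"; that entry name, the class ModelManifest, and its methods are assumptions for illustration, not the existing tooling.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintStream;
import java.nio.file.Path;
import java.util.Properties;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper; not part of the existing command line tools.
class ModelManifest {

    // Assumed entry name of the manifest inside the model zip package.
    static final String MANIFEST_ENTRY = "manifest.properties";

    // Loads the manifest without extracting anything to disk.
    static Properties read(Path modelFile) throws IOException {
        try (ZipFile zip = new ZipFile(modelFile.toFile())) {
            ZipEntry entry = zip.getEntry(MANIFEST_ENTRY);
            if (entry == null) {
                throw new IOException("No " + MANIFEST_ENTRY + " in " + modelFile);
            }
            Properties manifest = new Properties();
            try (InputStream in = zip.getInputStream(entry)) {
                manifest.load(in);
            }
            return manifest;
        }
    }

    // Prints the manifest as sorted "key = value" lines.
    static void print(Path modelFile, PrintStream out) throws IOException {
        Properties manifest = read(modelFile);
        for (String key : new TreeSet<>(manifest.stringPropertyNames())) {
            out.println(key + " = " + manifest.getProperty(key));
        }
    }
}
```

Calling ModelManifest.print(path, System.out) on a model file would then list whatever values were written at training time.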
Jörn