This is an automated email from the git hooks/post-receive script.

daube-guest pushed a commit to branch master
in repository debian-med-benchmarking-spec.git.
commit d0c874ac79f05ed105bab2a0557e773f8d82e30d
Author: Kevin Murray <[email protected]>
Date:   Fri Feb 5 14:36:37 2016 +0100

    Inital from etherpad
---
 benchmarking.md | 105 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)

diff --git a/benchmarking.md b/benchmarking.md
new file mode 100644
index 0000000..b49396e
--- /dev/null
+++ b/benchmarking.md
@@ -0,0 +1,105 @@
Benchmarking CI Service
=======================

Brainstorm of the Debian Med/SEQwiki/biotools benchmarking service.

This thing needs a name, ASAP!

Possible datasets:
 - https://sites.stanford.edu/abms/giab: Genome in a Bottle is a NIST human
   NGS resequencing dataset (paper:
   http://biorxiv.org/content/early/2015/09/15/026468)
 - http://gmisatest.referata.com/wiki/Dataset_1408MLGX6-3WGS

https://public.etherpad-mozilla.org/p/debian-med-benchmarking

Similar to the ReproducibleBuilds project: a machine that automatically
defines Debian benchmark packages based on a metadata repository whenever:
 - there is a new version of the underlying package in debian-med unstable;
 - there is a change in an applicable script for the calculation of metrics;
 - there is a change in an applicable benchmark dataset.

The EDAM classification of a tool lives in its DebTags, and the CWL
description of a tool lives within its Debian Med package.

The definitions and EDAM annotations of the datasets, and the scripts and
EDAM annotations for the calculation of the metrics, live in a git
repository and are synced on a regular basis into a relational database on
the same server that the packages are generated on. The pages on SEQwiki
are just a view onto that git repository. The results of the benchmarks
also live in the same relational database and are likewise synced onto
SEQwiki and reported to UDD.

These packages live and are built on a server separate from the official
Debian repositories.
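The "YAML-style metadata synced into a relational database" step could look roughly like the sketch below. The table schema, the field names (`name`, `edam`, `url`), and the flat `key: value` format are all assumptions for illustration; real code would use a proper YAML library and the schema the project eventually settles on.

```python
# Minimal sketch: sync flat "key: value" metadata documents (a tiny YAML
# subset) from a checked-out git repository into a SQLite table.
# Schema and field names are hypothetical, not part of the spec.
import sqlite3

def parse_flat_yaml(text):
    """Parse a flat 'key: value' document (a tiny YAML subset)."""
    doc = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        doc[key.strip()] = value.strip()
    return doc

def sync_metadata(docs, db_path=":memory:"):
    """Upsert parsed metadata records into a relational store."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS dataset
                    (name TEXT PRIMARY KEY, edam TEXT, url TEXT)""")
    for doc in docs:
        conn.execute("INSERT OR REPLACE INTO dataset VALUES (?, ?, ?)",
                     (doc.get("name"), doc.get("edam"), doc.get("url")))
    conn.commit()
    return conn

record = parse_flat_yaml("name: giab\nedam: topic_0622\n"
                         "url: https://sites.stanford.edu/abms/giab")
conn = sync_metadata([record])
```

A periodic cron job pulling the git repository and re-running such a sync would keep the database (and hence the SEQwiki view) up to date.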
Package definitions are automatically created based on the CWL descriptions
of the tools/toolchains that are to be benchmarked.

The packages are built by pbuilder.

Requirements for packages in Debian Med that can be benchmarked:
 - have an EDAM-compatible DebTag;
 - have a CWL file describing the included tool.

On the benchmarking machine we have CWL descriptions for data converters.
When a tool is benchmarked, we automatically build a path from the
benchmark dataset's file format to the tool's input file format, and then
from the tool's output file format to the metric-calculation script's
input file format.

Autopkgtest?

 - It may be worth investigating using the autopkgtest infrastructure
   (perhaps running the service as a debci instance) that runs the
   autopkgtest tests.
 - Each benchmark package contains a test (or tests) in autopkgtest
   format, which we parse and use on our debci.

Metadata storage:
 - A repository of YAML-style markup parsed into SQL?
 - Debtags?
 - Just write a script?
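The automatic converter-path construction described above (dataset format → tool input, tool output → metric input) amounts to a shortest-path search over a graph whose edges are the registered converters. A minimal sketch, with invented format names and converter tools that stand in for the real CWL-described converters:

```python
# Sketch: find the shortest chain of converters between two file formats
# with a breadth-first search. The format names and converter tools below
# are illustrative examples, not part of the spec.
from collections import deque

# (source format, target format) -> converter name
CONVERTERS = {
    ("sra", "fastq"): "sra-toolkit",
    ("fastq", "fasta"): "seqtk",
    ("sam", "bam"): "samtools",
}

def conversion_path(src, dst):
    """Return the shortest list of converters turning src into dst."""
    if src == dst:
        return []
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        fmt, path = queue.popleft()
        for (a, b), tool in CONVERTERS.items():
            if a == fmt and b not in seen:
                if b == dst:
                    return path + [tool]
                seen.add(b)
                queue.append((b, path + [tool]))
    return None  # no chain of converters exists

print(conversion_path("sra", "fasta"))  # -> ['sra-toolkit', 'seqtk']
```

With EDAM format annotations on each converter's CWL inputs and outputs, the graph edges could be derived automatically rather than hand-maintained.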
UDD:

 - https://udd.debian.org/dmd/?email1=debian-med-packaging%40lists.alioth.debian.org&email2=&email3=&packages=&ignpackages=&format=html#todo

 - Or, more sanely:
   https://udd.debian.org/dmd/?email1=spam%40kdmurray.id.au&email2=&email3=&packages=&ignpackages=&format=html#todo

 - A new table is required for the Debian benchmark status data.

 - Errors in building a test, or major changes in a metric, are reported
   to UDD in a fashion similar to the ReproducibleBuilds system: we report
   whether the test could be computed at all, and the largest positive and
   negative deviations of scores (in any dataset on any metric), plus a
   description of in which dataset and on which metric this deviation
   occurred.

Architecture:
 - Pre-package and publish to a separate local archive (benchmarking-specific
   code ONLY):
   - metric script .debs
   - dataset .debs
 - Tests run in docker containers:
   - Poll YAML or the DB for build-deps and datasets.
   - Install the tool and its dependencies from ftp.debian.org/debian.
   - Install the required evaluation/metric tools and dataset .debs from
     our own archive.
   - Create a Dockerfile for an image from the above (saved and published
     for every benchmark).
   - Run a container of the above image, producing data-file results.
   - Run the evaluation code and report the result.
   - Delete the container and image (keeping the Dockerfile).
   - Publish the result (either a text file [TSV, CSV or YAML], or a CGI
     script to pull from the DB).
   - The builder pushes status to UDD.
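The UDD report described above (whether the test could be computed, plus the largest positive and negative score deviations across datasets and metrics) can be sketched as below. The `(dataset, metric) -> score` dict layout is an assumption; the spec leaves the actual storage format open.

```python
# Sketch: summarise benchmark score changes between two runs for the UDD
# report. Scores are keyed by (dataset, metric); this layout is a guess.

def deviation_report(old, new):
    """Find the largest positive and negative score deviations."""
    deltas = {key: new[key] - old[key] for key in old if key in new}
    if not deltas:
        # No comparable scores: the test could not be computed at all.
        return {"computed": False}
    max_key = max(deltas, key=deltas.get)
    min_key = min(deltas, key=deltas.get)
    return {
        "computed": True,
        # Each entry: ((dataset, metric), deviation)
        "max_positive": (max_key, deltas[max_key]),
        "max_negative": (min_key, deltas[min_key]),
    }

old = {("giab", "f1"): 0.90, ("giab", "recall"): 0.88}
new = {("giab", "f1"): 0.95, ("giab", "recall"): 0.80}
report = deviation_report(old, new)
```

Here `report["max_positive"]` names the dataset/metric pair with the biggest improvement and `report["max_negative"]` the biggest regression, which is exactly the "where did it deviate" description the report calls for.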
