> On Thu, Oct 22, 2009 at 12:30:06AM +0900, Charles Plessy wrote:
> > First of all, let's summarise the situation. We want to integrate some
> > metadata in our 'web sentinels', like
> > 'http://debian-med.alioth.debian.org/tasks/bio'.

Dear Andreas and Olivier,

thank you for your encouraging comments.

I have made one more step forward, and upstream-metadata.debian.net now stores
its information in a Berkeley database. When a record is accessed, it is
refreshed only if it is older than a given age.

For the moment, only 17 source packages have an upstream-metadata.yaml file in
their debian directory that is accessible through a public VCS. Nevertheless, I
think that it is enough for a proof of principle. After resetting the database,
I ‘loaded’ the data by accessing it:

    for package in bioperl clustalx mummer seaview perlprimer samtools \
        dicomscope clustalw r-cran-combinat r-cran-haplo.stats r-cran-qvalue \
        r-cran-randomforest r-cran-rocr r-other-bio3d mira bwa infernal ; do
        wget "http://upstream-metadata.debian.net/$package/DOI" -O /dev/stdout 2> /dev/null
    done

After loading, the resulting table is available here:

    http://upstream-metadata.debian.net/table/DOI

Obviously, not all packages contain programs that have been described in an
academic article (http://dx.doi.org/)…

For the moment, one has to access an arbitrary key to trigger a refresh, but
later the best would be to have a special key, for instance YAML-UPDATE, that
would force the update. If it is possible to have a per-file commit hook, then
each time an upstream-metadata.yaml is modified, the debian.net site can be
updated.

Next step is to feed the UDD. For the moment, the site produces one table per
keyword. The rationale is that for many keywords, the data will be too sparse
to be interesting for the UDD. My current idea is to generate the tables for a
limited set of curated keywords, assemble them (with the unix join command?),
and leave the result in a public place that the UDD can read.

In parallel, as Olivier suggested, each table could be exported in RDF format.
But I am not sure I understand it. Olivier, could you suggest a Perl module to
use?

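The assembling step mentioned above could look roughly like the sketch below.
It is only an illustration under assumptions: the file names, the second
keyword (Homepage) and all the values are hypothetical placeholders, and the
per-keyword tables are assumed to be tab-separated "package, value" files.

```shell
#!/bin/sh
# Sketch: assemble per-keyword tables into one wide table with join(1).
# File names, the Homepage keyword and all values are hypothetical placeholders.
set -eu

workdir=$(mktemp -d)
tab=$(printf '\t')

# One table per keyword, each line being "package<TAB>value".
printf 'bioperl\tdoi-A\nmummer\tdoi-B\n'    > "$workdir/DOI.tsv"
printf 'seaview\thome-C\nbioperl\thome-A\n' > "$workdir/Homepage.tsv"

# join(1) needs both inputs sorted on the join field, in the same collation.
LC_ALL=C sort -t "$tab" -k 1,1 "$workdir/DOI.tsv"      > "$workdir/DOI.sorted"
LC_ALL=C sort -t "$tab" -k 1,1 "$workdir/Homepage.tsv" > "$workdir/Homepage.sorted"

# Full outer join: -a 1 -a 2 keeps packages present in only one table,
# -e fills in the missing value, -o fixes the column layout.
joined=$(LC_ALL=C join -t "$tab" -a 1 -a 2 -e '-' -o 0,1.2,2.2 \
    "$workdir/DOI.sorted" "$workdir/Homepage.sorted")

# Print the assembled table with a header line.
printf 'package\tDOI\tHomepage\n%s\n' "$joined"
rm -r "$workdir"
```

Since join only handles two files at a time, more keywords would mean chaining
the command, feeding each result into the next join. Packages lacking a keyword
get a '-' placeholder here, which shows how sparse the wide table can become.
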
As long as we are in a draft phase, I think that we can live with the biggest
current limitation: the lack of support for packages that are not stored in a
VCS. One possible way to solve the problem is to provide a repository, for
instance in collab-maint on Alioth, where people can drop one yaml file per
source package. We could also unpack source files, as Andreas suggested.

For the UDD import, which of Andreas' two proposals would be the most suitable?

> CREATE TABLE upstream-metadata (
>     package text,
>     key1 text,
>     key2 text,
>     ...
>     keyN text,
>     PRIMARY KEY package
> );

> CREATE TABLE upstream-metadata (
>     package text,
>     key text,
>     value text,
>     PRIMARY KEY (package,key)
> );

Since the addition of more meta-data to our source packages is a frequent issue
raised on debian-devel, I think that there is a general interest in
standardising ‘field’ names, whichever technical solution is adopted. I will
try to find a proper place on wiki.debian.org to let people document the fields
they create, and if necessary discuss them.

Have a nice day,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan