On Sun, Jan 15, 2012 at 10:09:06PM +0900, Charles Plessy wrote:
> Sorry, the data was actually rotten for multiple reasons. First, the machine
> running upstream-metadata.debian.net stopped keeping deb-src entries in its
> sources.list, so debcheckout was not working anymore, and my rudimentary
> scripts did not catch the error. I added error-catching to the TODO list.
> Second, when packages change their repository URL, which is not supposed to
> happen often, they have to be refreshed by hand. Third, I hardcoded the
> erroneous git URL git://git.debian.org/git, which we now correct to
> git://git.debian.org/.
>
> I have reloaded the data from scratch, by deleting the database and
> running the following command for each package med-bio depends on.
>
>   curl http://upstream-metadata.debian.net/$package/YAML-URL

Hmmm, I wonder to what extent you consider only med-bio as a target for
Ume(ga)ya. Several Debian Science packages would profit from this as well.
I assumed I had a brilliant idea, namely to simply check on Alioth with

   find /git/debian-med -name upstream-metadata.yaml
   find /git/debian-science -name upstream-metadata.yaml

but I learned another trick of Git: the debian/ dir is hidden in the
repository clone on Alioth. That's unfortunate for my idea.

> I am now injecting all the fields related to bibliography. By the way, I
> regret that I have put PMID and DOI outside the Reference-* namespace.
> Would you mind if I correct this ?

I remember that I was astonished by this decision, but I simply assumed you
had your reasons. I don't mind fixing something that should be fixed before
it sees any relevant usage, so it should be fixed now. I assume you could
then simply iterate over everything in Reference, which sounds quite
reasonable. Will you take care of fixing the existing upstream-metadata.yaml
files in our repository, or do you think we should do this step by step
manually?

> The file used for injection,
> http://upstream-metadata.debian.net/for_UDD/biblio.yaml,
> is valid YAML; this is why I managed to write the loader. It
> is a series of records, which all contain an array of three fields.
> Altogether, they are loaded as a table of three columns.

Well, I don't mind if you want to keep it this way. I'd consider it more
complicated than what I would have implemented, but if you volunteer to
maintain it that way, that's perfectly fine with me.

> upstream-metadata.debian.net stores its data in a Berkeley database, where the
> field names are the concatenation of the package name and the
> upstream-metadata.yaml field name, that is, if in the perlprimer package
> there is “PMID: 15073005”, the Berkeley DB will contain “15073005” for the
> field “perlprimer:PMID”. In the whole information chain, the structure is
> always ‘package - field - value’.
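
For illustration, the 'package - field - value' structure described above
can be sketched in Python as follows. This is only a minimal sketch under
stated assumptions: a plain dict stands in for the Berkeley DB, and the
helper names make_key and to_rows are hypothetical, not taken from the real
scripts. It also assumes package names never contain a colon, so splitting
on the first ":" recovers the package name.

```python
# Hypothetical stand-in for the Berkeley DB: keys are "package:field".
store = {
    "perlprimer:PMID": "15073005",  # the example quoted in the message
}


def make_key(package, field):
    """Concatenate package name and field name into one lookup key."""
    return f"{package}:{field}"


def to_rows(db):
    """Expand the flat key/value store into (package, field, value) rows."""
    rows = []
    for key, value in sorted(db.items()):
        # Split on the first colon only; the field name keeps any later ones.
        package, field = key.split(":", 1)
        rows.append((package, field, value))
    return rows


rows = to_rows(store)
for row in rows:
    print(row)
```

Each row then maps directly onto one line of a three-column table.
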
>
> I do not know where the perlprimer duplication came from. Perhaps there was an
> invisible character somewhere ? On the server side, there is a command line
> tool to manipulate field values directly, I may have made a typo when making
> tests. This said, I agree that the output should be sanitized. Also, I
> definitely agree to use PRIMARY KEY (package,key) as an extra safety net.
> Should it be added to udd/sql/bibref.sql ?

Yes. Just put it there and ping me once you have added means to cope with
injection problems because of this. I can push it to UDD after testing on
blends.d.n.

Kind regards

      Andreas.

-- 
http://fam-tille.de
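
P.S. As a rough illustration of the PRIMARY KEY (package,key) safety net
discussed above, here is a minimal Python sketch. It rests on several
assumptions: an in-memory SQLite database stands in for UDD, the table name
bibref and the column names package, key and value are guesses for
illustration only, and insert_row is a hypothetical helper. The point is
merely that a second row with the same (package, key) pair is rejected
instead of silently duplicated, so the loader has to cope with that error.

```python
import sqlite3

# In-memory SQLite database as a stand-in for UDD (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bibref ("
    " package TEXT NOT NULL,"
    ' "key" TEXT NOT NULL,'
    " value TEXT,"
    ' PRIMARY KEY (package, "key"))'
)


def insert_row(connection, package, key, value):
    """Insert one row; return False if the primary key rejects a duplicate."""
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO bibref VALUES (?, ?, ?)", (package, key, value)
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = insert_row(conn, "perlprimer", "PMID", "15073005")
second = insert_row(conn, "perlprimer", "PMID", "15073005")  # duplicate pair
count = conn.execute("SELECT COUNT(*) FROM bibref").fetchone()[0]
print(first, second, count)
```
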

