Re: RRID update on salsa on packages starting with A+B
On Fri, Apr 06, 2018 at 02:22:40PM +0200, Steffen Möller wrote:
> Whenever you find the moment, please review the presentation of
>
> rate4site

$ yamllint metadata
metadata
  1:1       warning  missing document start "---"  (document-start)
  7:8       error    trailing spaces  (trailing-spaces)
  10:2      error    syntax error: could not find expected ':'

> sambamba

$ git diff
diff --git a/debian/upstream/metadata b/debian/upstream/metadata
index ee8300c..b4afe9d 100644
--- a/debian/upstream/metadata
+++ b/debian/upstream/metadata
@@ -14,7 +14,7 @@ Reference:
 10.1093/bioinformatics/btv098"
 eprint: "https://academic.oup.com/bioinformatics/article-pdf/\
 31/12/2032/568750/btv098.pdf"
-Registries:
+Registry:
  - Name: bio.tools
    Entry: Sambamba
  - Name: OMICtools

> on the task page which do not show their references.

I would love it if someone other than me took over proof-reading this kind of issue before the code is suspected to be wrong (= I will not check the other two packages of your other mail right now). Thanks to anybody who helps in enhancing metadata, but I would love to concentrate on bug fixing of packages.

Thank you

      Andreas.

--
http://fam-tille.de
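The two problems found above (trailing spaces and the misspelled key "Registries") could be caught before pushing with a small check like the following. This is only a rough sketch and no replacement for yamllint: the function name find_metadata_problems is made up for illustration, and it looks at nothing beyond these two issues.

```python
from typing import List, Tuple


def find_metadata_problems(text: str) -> List[Tuple[int, str]]:
    """Return (line number, message) pairs for two simple problems.

    Hypothetical helper: it only flags trailing whitespace and the
    top-level key 'Registries', which should be spelled 'Registry'.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Trailing whitespace is what yamllint reports as 'trailing-spaces'.
        if line != line.rstrip():
            problems.append((number, "trailing spaces"))
        # The importer discussed in this thread expects the key 'Registry'.
        if line.startswith("Registries:"):
            problems.append((number, "key 'Registries' should be 'Registry'"))
    return problems


if __name__ == "__main__":
    sample = "Registries:\n - Name: bio.tools \n   Entry: Sambamba\n"
    for number, message in find_metadata_problems(sample):
        print(f"{number}: {message}")
```

Running yamllint itself remains the better check, since it also catches real syntax errors such as the one reported for rate4site.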
Re: RRID update on salsa on packages starting with A+B
On 4/6/18 2:22 PM, Steffen Möller wrote:
Hi Andreas,

Whenever you find the moment, please review the presentation of

  rate4site
  sambamba
  genometools
  maffilter

on the task page which do not show their references.

Cheers,

Steffen
Re: RRID update on salsa on packages starting with A+B
Hi Andreas,

Whenever you find the moment, please review the presentation of

  rate4site
  sambamba

on the task page which do not show their references.

Cheers,

Steffen
Re: d/u/metadata for binary or source tree Re: RRID update on salsa on packages starting with A+B
On Wed, Apr 04, 2018 at 01:56:42PM +0200, Steffen Möller wrote: > > > https://blends.debian.org/blends/apa.html#datagathering > > > > (A.7.) could be a first shot on this problem. > One half done. The invitation to contribute to salsa I have no idea what you might consider *Blend* *specific* in using Salsa. Everything worked before Salsa and the fact that Git repositories now can be changed via web form has nothing to do with Blends. Am I missing something? > and the screenshots and translations is missing. That's perfectly generic Debian stuff. Well, I admit I wrote the UDD importers for screenshots and translations because I intended to use it on the tasks pages. But it is used in very different places for all Debian packages. So what exactly do you want to be documented here? (Patches welcome) > > That's not about branches. As previously said: Registry data are per > > source package currently. There is no means in the registry data table > > to map an entry to a binary package. Thus autodock *and* autogrid are > > missing registry data both. > > There was one d/u/metadata file that assigned different references IIRC to > different binaries of that source package by inventing a sub-hierarchy. Can you please re-read https://lists.debian.org/debian-med/2018/03/msg00119.html Seek for "For citations we are using the field Debian-package" > This > seems to indicate that this is not only an issue for the registry entries. > How about the following: > > * d/u/metadata always refers to the source tree as a whole That's the case. > * d/u/metadata also refers to the binary with the same name as the source > tree. That's also the case (more or less unintended but it is working like this). > * information in d/u/metadata associated with a particular set of binaries > with the same or different names shall be listed in a "binary: " attribute See my posting. > Admittedly, I see some issues with that. 
The notion of a reference binary > for a package with many binaries would be helpful for Debian in general. > This would prevent something like the Massxpert entry on our task list that > describes itself as a transition package. Hmmm, I do not understand what you mean. Fixing the massxpert package is only one commit away: https://salsa.debian.org/blends-team/med/commit/6451da2d816f762cf391f048676c79eee1c25431 Please, if anybody notices this, just fix the according task instead of assuming intentional displaying cruft. Its not *that* hard to do to replace an outdated package name by a correct one. > But this also means that such > notion about, say, "end-user relevant" packages vs technical helper packages > because of arch-independency or libraries etc, should be declared in > d/control. I guess you want to invent some extra layer of complexity for d/control which will never be accepted to solve a problem for Registry entries which was just solved for citations. We just need a *decision* I was asking for in the end of the mail I was linking to above. > Another concern of mine is that we typically do not tag any data for being > specific for something. We put it in different files. As such, we would need > d/u/.metadata. Having data relevant for *binary* packages in a dir named *upstream* does not sound very logical for me. > For d/u/edam we had the concept of a summary representing the packages as a > whole and then subsections for every binary. I admit I do not fully understand the edam data. I guess the only reason that we are not facing issues with these data is that they are just stored in UDD and the only (non-)use case is some query that exports the data again. I doubt that anybody has done some QA on the output. > I kind of liked it the way it > is, but maybe separating that out to different files would also be > appropriate. Could there be an option to do it either way? 
Again: I think that what we implemented for citations (the Debian-Package field), which is documented in the Wiki[1], is absolutely sufficient to solve the problem we have. I see no reason to invent something else.

The only thing that we should think about is the data storage model: do we use another column as in the bibref table, or should the importer translate the data directly to binary packages according to the algorithm above: the binary package is the source package if nothing else is specified; otherwise use Debian-Package as the binary package name. It depends a bit on the applications that will consume the data. For the moment it is only the Blends web sentinel, which is rather binary package centric. The bibref table layout is rather a hack I created afterwards, after we ran into exactly the same problem with references that we have now. Maybe a binary package based table layout is more sensible.

> I cannot judge how much of a hassle it is to fiddle anything like that into
> the UDD. What do you think?

I keep on thinking that the established method is not much of a hassle
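The mapping rule described in this message (binary package = source package unless a Debian-Package field says otherwise) can be sketched in a few lines. This is a minimal illustration, not the UDD importer code: the function name binary_package_for is hypothetical, and whether Registry entries would carry a Debian-Package field the way citations do is exactly the open decision, so that part is an assumption.

```python
from typing import Mapping


def binary_package_for(source: str, entry: Mapping[str, str]) -> str:
    """Return the binary package name a metadata entry applies to.

    Assumption: an entry may carry an optional 'Debian-Package' field,
    as is done for citations. Without it the entry is attributed to the
    binary package that has the same name as the source package.
    """
    explicit = entry.get("Debian-Package")
    return explicit if explicit else source


if __name__ == "__main__":
    # No Debian-Package field: falls back to the source package name.
    print(binary_package_for("autodocksuite", {"Name": "bio.tools", "Entry": "AutoDock"}))
    # Explicit Debian-Package field: that binary package is used.
    print(binary_package_for("autodocksuite", {"Name": "bio.tools", "Entry": "AutoDock", "Debian-Package": "autogrid"}))
```

With such a function the importer could fill a binary package based table directly, which would match what the web sentinel consumes.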
Re: RRID update on salsa on packages starting with A+B
Hi Tony, On 4/3/18 1:59 PM, Tony Travis wrote: On 03/04/18 11:27, Steffen Möller wrote: [...] This is a bit of a side-track from the core of this thread. Maybe we would have Tony describing our achievements one they are manifesting in Bio-Linux. And if we have a look at https://salsa.debian.org/dashboard/milestones ? Maybe we can use those as an anchor for achievements from which news could be generated in an easier way, not necessarily by us. [...] Yes, I would be happy to do that. Yippee. I see five bits, there may be more: A) New bits for Bio-Linux that its new release features B) New bits for Bio-Linux that shall be coming at some future point C) New bits for Debian and its derivative distributions that have been established D) New bits for Debian and its derivative distributions that are currently being worked on E) New bits for Debian and its derivatives that are currently being discussed on the Sprint(s) or mailing list(s). In an ideal world we would see a lot moving from E to A and from B to E, so the effort to describe this all would be mostly done only once :) I know, it will not work, let alone because the audiences are different. In my PoV there is not too much of a conceptional difference between porting to Bio-Linux and having backports within Debian. We need to address both. Just who should do all that. Since Debian Med takes about every opportunity that it is just a regular part of Debian, we somehow do not have any home page since there is the Debian home page already. As such there is also no such thing like a page where to distribute any Debian Med-specific news. Closest is possibly https://blends.debian.org/med/ and http://debian-med.alioth.debian.org/ . Nobody truly wants to read through our Wiki page on https://wiki.debian.org/DebianMed . We have http://debianmed.blogspot.de/ with latest news from 2014. So, I have no immediate idea about how to improve on it all. 
I'm now only doing minimal maintenance work on Bio-Linux 8, based on Ubuntu 14.04 LTS (Trusty) while I start work on Bio-Linux 9, based on Ubuntu-MATE 18.04 LTS (Bionic). I've got the 18.04 Ubuntu-MATE beta .iso and I've started work remastering it for Bio-Linux using "customizer": https://github.com/kamilion/customizer I cannot judge. The readme reads fine. I plan to base Bio-Linux 9 on Debian-Med + Bioconda That is good. Please consider to also install the singularity and docker clients. And the cwltool. and would like to start adding Ubuntu 18.04 (Bionic) versions of Debian-Med packages to the Debian-Med PPA if nobody objects. I also want to drop ALL the 'bad' NEBC packages (i.e. the binary-only packages that Bela and Tim used initially to migrate from Debian testing (Sarge) to Ubuntu 6.06 LTS. Yip. I've listened to Andreas: I intend to create a "bio-linux-desktop" meta-package modelled on the Debian-Med/Bio* tasks that are now being updated to include some 'missing' packages that were only in Bio-Linux. We should think about adding the version in Ubuntu and Bio-Linux to show up among the versions in the task pages. I would not know how to implement that, admittedly. Best, Steffen
d/u/metadata for binary or source tree Re: RRID update on salsa on packages starting with A+B
On 4/3/18 5:09 PM, Andreas Tille wrote: On Tue, Apr 03, 2018 at 12:27:08PM +0200, Steffen Möller wrote: There is a daily cron job parsing Salsa directories. Fine. Somewhere there is (or should be :o) ) a documentation how this page is crafted. On our Wiki? Let us then have a link to that page. May be this https://blends.debian.org/blends/apa.html#datagathering (A.7.) could be a first shot on this problem. One half done. The invitation to contribute to salsa and the screenshots and translations is missing. Could branches for your cron job's autodock checkout differ? The page was updated this morning but yet not references. Or is there a second directory "autodock" when the source package name is "autodocksuite" (because of the joint autogrid tool)? That's not about branches. As previously said: Registry data are per source package currently. There is no means in the registry data table to map an entry to a binary package. Thus autodock *and* autogrid are missing registry data both. There was one d/u/metadata file that assigned different references IIRC to different binaries of that source package by inventing a sub-hierarchy. This seems to indicate that this is not only an issue for the registry entries. How about the following: * d/u/metadata always refers to the source tree as a whole * d/u/metadata also refers to the binary with the same name as the source tree. * information in d/u/metadata associated with a particular set of binaries with the same or different names shall be listed in a "binary: " attribute Admittedly, I see some issues with that. The notion of a reference binary for a package with many binaries would be helpful for Debian in general. This would prevent something like the Massxpert entry on our task list that describes itself as a transition package. But this also means that such notion about, say, "end-user relevant" packages vs technical helper packages because of arch-independency or libraries etc, should be declared in d/control. 
Another concern of mine is that we typically do not tag any data for being specific for something. We put it in different files. As such, we would need d/u/.metadata. For d/u/edam we had the concept of a summary representing the packages as a whole and then subsections for every binary. I kind of liked it the way it is, but maybe separating that out to different files would also be appropriate. Could there be an option to do it either way? I cannot judge how much of a hassle it is to fiddle anything like that into the UDD. What do you think? Best, Steffen
Re: RRID update on salsa on packages starting with A+B
Hi Tony,

On Tue, Apr 03, 2018 at 12:59:46PM +0100, Tony Travis wrote:
> > [...]
> > This is a bit of a side-track from the core of this thread. Maybe we
> > would have Tony describing our achievements one they are manifesting in
> > Bio-Linux. And if we have a look at
> > https://salsa.debian.org/dashboard/milestones ? Maybe we can use those
> > as an anchor for achievements from which news could be generated in an
> > easier way, not necessarily by us.
>
> Yes, I would be happy to do that.

I admit I do not mind which technique is used. I think the person who is doing the actual publicity work should freely choose the channel for communication.

> I plan to base Bio-Linux 9 on Debian-Med + Bioconda and would like to
> start adding Ubuntu 18.04 (Bionic) versions of Debian-Med packages to
> the Debian-Med PPA if nobody objects. I also want to drop ALL the 'bad'
> NEBC packages (i.e. the binary-only packages that Bela and Tim used
> initially to migrate from Debian testing (Sarge) to Ubuntu 6.06 LTS.

Please let us know if we should step up packaging efforts for certain packages you consider very important. As I have seen for some older packages, the licensing is frequently the only blocker; technical issues are easy to solve.

> I've listened to Andreas: I intend to create a "bio-linux-desktop"
> meta-package modelled on the Debian-Med/Bio* tasks that are now being
> updated to include some 'missing' packages that were only in Bio-Linux.

Feel free to discuss this here.

> Thanks again to all in the Debian-Med team for your help and support,

You are welcome

      Andreas.

--
http://fam-tille.de
Re: RRID update on salsa on packages starting with A+B
On Tue, Apr 03, 2018 at 12:27:08PM +0200, Steffen Möller wrote:
> > There is a daily cron job parsing Salsa directories.
> Fine. Somewhere there is (or should be :o) ) a documentation how this page
> is crafted. On our Wiki? Let us then have a link to that page.

Maybe this

   https://blends.debian.org/blends/apa.html#datagathering

(A.7.) could be a first shot at this problem.

> Could branches for your cron job's autodock checkout differ? The page was
> updated this morning but yet not references. Or is there a second directory
> "autodock" when the source package name is "autodocksuite" (because of the
> joint autogrid tool)?

That's not about branches. As previously said: Registry data are currently per source package. There is no means in the registry data table to map an entry to a binary package. Thus autodock *and* autogrid are both missing registry data.

Kind regards

      Andreas.

--
http://fam-tille.de
Re: RRID update on salsa on packages starting with A+B
On 03/04/18 11:27, Steffen Möller wrote:
> [...]
> This is a bit of a side-track from the core of this thread. Maybe we
> would have Tony describing our achievements one they are manifesting in
> Bio-Linux. And if we have a look at
> https://salsa.debian.org/dashboard/milestones ? Maybe we can use those
> as an anchor for achievements from which news could be generated in an
> easier way, not necessarily by us.

Hi, Steffen.

Yes, I would be happy to do that.

I'm now only doing minimal maintenance work on Bio-Linux 8, based on Ubuntu 14.04 LTS (Trusty) while I start work on Bio-Linux 9, based on Ubuntu-MATE 18.04 LTS (Bionic). I've got the 18.04 Ubuntu-MATE beta .iso and I've started work remastering it for Bio-Linux using "customizer":

> https://github.com/kamilion/customizer

I plan to base Bio-Linux 9 on Debian-Med + Bioconda and would like to start adding Ubuntu 18.04 (Bionic) versions of Debian-Med packages to the Debian-Med PPA if nobody objects. I also want to drop ALL the 'bad' NEBC packages (i.e. the binary-only packages that Bela and Tim used initially to migrate from Debian testing (Sarge) to Ubuntu 6.06 LTS).

I've listened to Andreas: I intend to create a "bio-linux-desktop" meta-package modelled on the Debian-Med/Bio* tasks that are now being updated to include some 'missing' packages that were only in Bio-Linux.

Thanks again to all in the Debian-Med team for your help and support,

  Tony.

--
Minke Informatics Limited, Registered in Scotland - Company No. SC419028
Registered Office: 3 Donview, Bridge of Alford, AB33 8QJ, Scotland (UK)
tel. +44(0)19755 63548    http://minke-informatics.co.uk
mob. +44(0)7985 078324    mailto:tony.tra...@minke-informatics.co.uk
Re: RRID update on salsa on packages starting with A+B
On 4/3/18 10:30 AM, Andreas Tille wrote: On Tue, Apr 03, 2018 at 12:14:57AM +0200, Steffen Möller wrote: Registry: - Name: OMICtools Entry: NA - Name: RRID Entry: NA - Name: bio.tools Entry: NA That is an interesting one. Please kindly check https://salsa.debian.org/med-team/autodocksuite/blob/master/debian/upstream/metadata which on my side shows Registry: - Name: OMICtools Entry: OMICS_19997 - Name: SciCrunch Entry: SCR_012746 - Name: bio.tools Entry: AutoDock May be you forgot to push. I now received commit e5292af2136df30df8ea2a0da0dbeba6b82b027b (HEAD -> master, origin/master) Author: Steffen Möller Date: Mon Mar 26 16:05:33 2018 + Added RRID to metadata which has the values you are mentioning. This was not available in the public repository (at least until 1st of April). Hm. No. Have not touched it again. May be some delay-thingy somewhere. I remember it was not your prefered solution but for the moment 'NA' values are not stored. [...] How would you like Name and no fancy colouring? The idea was to inform the world (and ourselves) that we have checked. I admit I do not like this. The thing is that the maintainers of the repository could have updated their data. Recent development shows that this is a very probable action these days. I do not think that the fact that we have checked is no information which is valuable for a random visitor of the tasks page and it might be simply wrong since the data were updated. So I keep on thinking that 'NA' is not anything interesting for the page. For us as developers we can easily do grep -w 'NA' */debian/upstream/metadata if we want to find the information. Right. If we developers have the whole of Debian Med checked out. I don't. It is a side-issue. Don't worry. The problem is that we are unlikely to be informed about an entry being added to the registry, so we would look bad. But since the advent of salsa.d.o I am tempted to risk that. Sorry, I do not understand this. 
We have that mistake only for a few days upon notification. bio-express (https://salsa.debian.org/med-team/express) It is rather berkeley-express. This works now[1]. The reason for the problem was that the repository name is different from the source package name. The importer was simply wrong for these cases. This is fixed now (hopefully!) but we should definitely avoid this kind of divergence and I'm tempted to adapt the repository name to the source package name once we migrate anonscm.d.o to salsa.d.o (I've lost hope for some sensible solution to keep anonscm working :-(((). Aaah. Yes. This sounds like a reasonable addition to the Debian Med policy. We are intuitively doing it in close to all of our packages but specifying it explicitly in policy makes sense, definitely. gromacs No Registry entries in d/upstream/metadata - so nothing to display here. :o/ Because that is debichem and looked at my not-yet-merged own branch. Why not asking for membership in Debichem and commit directly? I was with debichem on alioth and eventually will ask for membership again. However, I would loose the external view on how we present ourselves. And I am much after attracting casual contributors to our cause, e.g. to improve the description of our packages or ... registry entries in d/u/metadata. So, debichem is a test-case to teach me how this feels on the other side. This document is prepared every 24 hours for Debian packages selected in the https://salsa.debian.org/blends-team/med/blob/master/tasks/bio";>Med Bio Task Description withinformation in the Ultimate Debian Database (https://udd.debian.org";>UDD). The bottom line has a creation date of the web page. Currently it says: Last update: Mon, 02 Apr 2018 19:21:37 - Excellent. I have missed that one. It was last updated at [Date+Time]. This information is not available, sorry. 
I am uncertain about what "this" refers to, but maybe you have some idea about who the page can explain itself to the ones who want to contribute code or content. There is a daily cron job parsing Salsa directories. Fine. Somewhere there is (or should be :o) ) a documentation how this page is crafted. On our Wiki? Let us then have a link to that page. Casual contributors, that is what I am primarily after. For everything else I happily ask you or Ole. I do not keep track on the information when this job is finished. The UDD importer is reading the result of this job later and I also do not keep track when this job ends. So we have two uncertain times - the only time that can be safely displayed is when the web pages are displayed which is done on the bottom line. I admit that I'm not very motivated to make some effort to keep track of the other times compared to other tasks on my desk. That is fine. There could be a line like "Please allow 48 hours for your change to the package selection or the package description to have an effect on th
Re: RRID update on salsa on packages starting with A+B
On Tue, Apr 03, 2018 at 12:14:57AM +0200, Steffen Möller wrote: > > > Registry: > > - Name: OMICtools > > Entry: NA > > - Name: RRID > > Entry: NA > > - Name: bio.tools > > Entry: NA > > That is an interesting one. Please kindly check > > https://salsa.debian.org/med-team/autodocksuite/blob/master/debian/upstream/metadata > > which on my side shows > > Registry: > - Name: OMICtools > Entry: OMICS_19997 > - Name: SciCrunch > Entry: SCR_012746 > - Name: bio.tools > Entry: AutoDock May be you forgot to push. I now received commit e5292af2136df30df8ea2a0da0dbeba6b82b027b (HEAD -> master, origin/master) Author: Steffen Möller Date: Mon Mar 26 16:05:33 2018 + Added RRID to metadata which has the values you are mentioning. This was not available in the public repository (at least until 1st of April). > > I remember it was not your prefered solution but for the moment 'NA' > > values are not stored. [...] > How would you like Name and no fancy colouring? > > The idea was to inform the world (and ourselves) that we have checked. I admit I do not like this. The thing is that the maintainers of the repository could have updated their data. Recent development shows that this is a very probable action these days. I do not think that the fact that we have checked is no information which is valuable for a random visitor of the tasks page and it might be simply wrong since the data were updated. So I keep on thinking that 'NA' is not anything interesting for the page. For us as developers we can easily do grep -w 'NA' */debian/upstream/metadata if we want to find the information. > The > problem is that we are unlikely to be informed about an entry being added to > the registry, so we would look bad. But since the advent of salsa.d.o I am > tempted to risk that. Sorry, I do not understand this. > > > bio-express (https://salsa.debian.org/med-team/express) > > It is rather berkeley-express. This works now[1]. 
The reason for the > > problem was that the repository name is different from the source > > package name. The importer was simply wrong for these cases. This is > > fixed now (hopefully!) but we should definitely avoid this kind of > > divergence and I'm tempted to adapt the repository name to the source > > package name once we migrate anonscm.d.o to salsa.d.o (I've lost hope > > for some sensible solution to keep anonscm working :-(((). > Aaah. Yes. This sounds like a reasonable addition to the Debian Med policy. We are intuitively doing it in close to all of our packages but specifying it explicitly in policy makes sense, definitely. > > > gromacs > > No Registry entries in d/upstream/metadata - so nothing to display here. > :o/ Because that is debichem and looked at my not-yet-merged own branch. Why not asking for membership in Debichem and commit directly? > > > This document is prepared every 24 hours for Debian packages selected in > > > the > > > > > href="https://salsa.debian.org/blends-team/med/blob/master/tasks/bio";>Med > > > Bio Task Description withinformation in the Ultimate Debian Database > > > ( > > href="https://udd.debian.org";>UDD). > > The bottom line has a creation date of the web page. Currently it says: > > > > Last update: Mon, 02 Apr 2018 19:21:37 - > Excellent. I have missed that one. > > > It was last updated at [Date+Time]. > > This information is not available, sorry. > > I am uncertain about what "this" refers to, but maybe you have some idea > about who the page can explain itself to the ones who want to contribute > code or content. There is a daily cron job parsing Salsa directories. I do not keep track on the information when this job is finished. The UDD importer is reading the result of this job later and I also do not keep track when this job ends. So we have two uncertain times - the only time that can be safely displayed is when the web pages are displayed which is done on the bottom line. 
I admit that I'm not very motivated to make some effort to keep track of the other times compared to other tasks on my desk. > > There is no difference between pushing from remote repository and a web > > interface edit. > > Good. > > Curious about this autodock/autodocksuite thingy and maybe there is some > chance to have this self-explaining bit at the end. I have no idea how this can happen. I do not use the web interface for editing package data. > I have added registry links to about all green entries in bio now. Let us > review those over the next weeks a bit and then think about an announcing it > with the Debian News or whatever the list finds appropriate. I'd love if we could find some default channel for Debian Med news (and a person who feels dedicated to feed this channel regularly). Kind regards Andreas. -- http://fam-tille.de
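The divergence between repository name and source package name mentioned in this message (repository "express", source package "berkeley-express") can be detected from the first line of debian/changelog, which starts with the source package name followed by the version in parentheses. The following is a sketch under that assumption; the function names and the version number in the example are made up, and this is not how the importer does it.

```python
import re
from typing import Optional

# First line of debian/changelog: "<source> (<version>) <distribution>; ..."
_CHANGELOG_HEADER = re.compile(r"^(?P<source>[a-z0-9][a-z0-9+.-]+) \(")


def source_name_from_changelog(changelog_text: str) -> Optional[str]:
    """Extract the source package name from the changelog header line."""
    lines = changelog_text.splitlines()
    if not lines:
        return None
    match = _CHANGELOG_HEADER.match(lines[0])
    return match.group("source") if match else None


def repo_name_diverges(repo_name: str, changelog_text: str) -> bool:
    """True if the repository name differs from the source package name."""
    source = source_name_from_changelog(changelog_text)
    return source is not None and source != repo_name


if __name__ == "__main__":
    # Hypothetical changelog header; the version number is invented.
    changelog = "berkeley-express (0.0-1) unstable; urgency=medium\n"
    print(repo_name_diverges("express", changelog))
    print(repo_name_diverges("berkeley-express", changelog))
```

A check like this, run over all team repositories, would list the candidates for renaming once the policy is written down.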
Re: RRID update on salsa on packages starting with A+B
On 4/2/18 11:32 PM, Andreas Tille wrote: Hi Steffen, thanks a lot for your continuous work on assembling registry data. On Sat, Mar 31, 2018 at 02:41:42AM +0200, Steffen Möller wrote: There are still packages that are not updated. Worthwhile candidates are autodock This has: Registry: - Name: OMICtools Entry: NA - Name: RRID Entry: NA - Name: bio.tools Entry: NA That is an interesting one. Please kindly check https://salsa.debian.org/med-team/autodocksuite/blob/master/debian/upstream/metadata which on my side shows Registry: - Name: OMICtools Entry: OMICS_19997 - Name: SciCrunch Entry: SCR_012746 - Name: bio.tools Entry: AutoDock I remember it was not your prefered solution but for the moment 'NA' values are not stored. [...] How would you like Name and no fancy colouring? The idea was to inform the world (and ourselves) that we have checked. The problem is that we are unlikely to be informed about an entry being added to the registry, so we would look bad. But since the advent of salsa.d.o I am tempted to risk that. bio-express (https://salsa.debian.org/med-team/express) It is rather berkeley-express. This works now[1]. The reason for the problem was that the repository name is different from the source package name. The importer was simply wrong for these cases. This is fixed now (hopefully!) but we should definitely avoid this kind of divergence and I'm tempted to adapt the repository name to the source package name once we migrate anonscm.d.o to salsa.d.o (I've lost hope for some sensible solution to keep anonscm working :-(((). Aaah. Yes. This sounds like a reasonable addition to the Debian Med policy. Bio-eagle Works now[2]. Same as above. Thanks. garli Works (and as far as I can see it has also worked before - no idea why it is on your list). Most likely I did not wait long enough prior to reporting. gdpc Same here - just works. gromacs No Registry entries in d/upstream/metadata - so nothing to display here. 
:o/ Because that is debichem and looked at my not-yet-merged own branch. a problem for me to report them is that I am uncertain about the time that needs to be passed until the update is run. Could you please help with a little line at the bottom alike: This document is prepared every 24 hours for Debian packages selected in the https://salsa.debian.org/blends-team/med/blob/master/tasks/bio";>Med Bio Task Description withinformation in the Ultimate Debian Database (https://udd.debian.org";>UDD). The bottom line has a creation date of the web page. Currently it says: Last update: Mon, 02 Apr 2018 19:21:37 - Excellent. I have missed that one. It was last updated at [Date+Time]. This information is not available, sorry. I am uncertain about what "this" refers to, but maybe you have some idea about who the page can explain itself to the ones who want to contribute code or content. The good part of the problem is that whenever I check if something was updated, I'll add RRIDs to another package. This typically is from within salsa. Is that possibly not triggering the same routines as when I push from a remote-from-salsa repository? There is no difference between pushing from remote repository and a web interface edit. Good. Curious about this autodock/autodocksuite thingy and maybe there is some chance to have this self-explaining bit at the end. I have added registry links to about all green entries in bio now. Let us review those over the next weeks a bit and then think about an announcing it with the Debian News or whatever the list finds appropriate. Best, Steffen
Re: RRID update on salsa on packages starting with A+B
Hi Steffen, thanks a lot for your continuous work on assembling registry data. On Sat, Mar 31, 2018 at 02:41:42AM +0200, Steffen Möller wrote: > There are still packages that are not updated. Worthwhile candidates > are > > autodock This has: Registry: - Name: OMICtools Entry: NA - Name: RRID Entry: NA - Name: bio.tools Entry: NA I remember it was not your prefered solution but for the moment 'NA' values are not stored. The importer log says: 2018-04-01 14:57:13,012 - DEBUG - (125): Registry data found for source 'autodocktools' of debian-med: registry = [{'Entry': 'NA', 'Name': 'OMICtools'}, {'Entry': 'NA', 'Name': 'RRID'}, {'E ntry': 'NA', 'Name': 'bio.tools'}] 2018-04-01 14:57:13,013 - INFO - (364): Registry entry OMICtools from source autodocktools was removed since it is NA. 2018-04-01 14:57:13,013 - INFO - (364): Registry entry RRID from source autodocktools was removed since it is NA. 2018-04-01 14:57:13,013 - INFO - (364): Registry entry bio.tools from source autodocktools was removed since it is NA. 2018-04-01 14:57:13,013 - DEBUG - (415): No registry data for source 'autodocktools' of debian-med: upstream.registry = [] So this works as expected. > bio-express (https://salsa.debian.org/med-team/express) It is rather berkeley-express. This works now[1]. The reason for the problem was that the repository name is different from the source package name. The importer was simply wrong for these cases. This is fixed now (hopefully!) but we should definitely avoid this kind of divergence and I'm tempted to adapt the repository name to the source package name once we migrate anonscm.d.o to salsa.d.o (I've lost hope for some sensible solution to keep anonscm working :-(((). > Bio-eagle Works now[2]. Same as above. > garli Works (and as far as I can see it has also worked before - no idea why it is on your list). > gdpc Same here - just works. > gromacs No Registry entries in d/upstream/metadata - so nothing to display here. 
> a problem for me to report them is that I am uncertain about the time that
> needs to be passed until the update is run. Could you please help with a
> little line at the bottom alike:
>
> This document is prepared every 24 hours for Debian packages selected in the
> Med Bio Task Description
> (https://salsa.debian.org/blends-team/med/blob/master/tasks/bio) with
> information in the Ultimate Debian Database (UDD, https://udd.debian.org).

The bottom line has a creation date of the web page. Currently it says:

    Last update: Mon, 02 Apr 2018 19:21:37 -

> It was last updated at [Date+Time].

This information is not available, sorry.

> The good part of the problem is that whenever I check if something was
> updated, I'll add RRIDs to another package. This typically is from within
> salsa. Is that possibly not triggering the same routines as when I push from
> a remote-from-salsa repository?

There is no difference between pushing from a remote repository and a web interface edit.

Kind regards

Andreas.

[1] https://blends.debian.org/med/tasks/bio#berkeley-express
[2] https://blends.debian.org/med/tasks/bio#bio-eagle

-- http://fam-tille.de
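Editorial sketch: the 'NA' handling described in the mail above can be illustrated as follows. This is not the actual UDD importer code; the function name `drop_na_entries` and the logger name are hypothetical, and only the log message wording is taken from the importer log quoted in the mail.

```python
import logging

# Hypothetical logger name; the real importer's logging setup is not shown in the thread.
logger = logging.getLogger("registry_sketch")


def drop_na_entries(source, registry):
    """Return only those registry items whose Entry is not the placeholder 'NA'.

    `registry` is the list of {'Name': ..., 'Entry': ...} dicts as it
    appears in the importer log quoted above.
    """
    kept = []
    for item in registry:
        if item.get("Entry") == "NA":
            # Mirrors the INFO message seen in the quoted importer log.
            logger.info(
                "Registry entry %s from source %s was removed since it is NA.",
                item.get("Name"),
                source,
            )
            continue
        kept.append(item)
    return kept


# Data as quoted in the thread: every entry is 'NA', so nothing is stored.
autodocktools = [
    {"Entry": "NA", "Name": "OMICtools"},
    {"Entry": "NA", "Name": "RRID"},
    {"Entry": "NA", "Name": "bio.tools"},
]

# clonalframe, as discussed elsewhere in the thread: one real entry, two 'NA'.
clonalframe = [
    {"Entry": "SCR_016060", "Name": "SciCrunch"},
    {"Entry": "NA", "Name": "bio.tools"},
    {"Entry": "NA", "Name": "OMICtools"},
]

print(drop_na_entries("autodocktools", autodocktools))
print(drop_na_entries("clonalframe", clonalframe))
```

Under this assumption a package whose metadata lists only 'NA' values ends up with an empty registry list, which matches the "No registry data" DEBUG line in the log.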
Re: RRID update on salsa on packages starting with A+B
Hi Andreas,

On 3/30/18 3:14 PM, Andreas Tille wrote:

Hi Steffen,

On Fri, Mar 30, 2018 at 01:11:28AM +0200, Andreas Tille wrote:

Please kindly have a look at clonalframe - it has a ref for SciCrunch, is yamllint clean but nothing shows up.

H, good catch. clonalframe | SciCrunch | SCR_016060 is in UDD. I have no idea why it is not displayed on the tasks page. Needs further investigation.

Fixed now.

Happy here. :)

I realised that importer and tasks pages generation scripts were all perfectly working. However, for a reason I did not understand the gatherer was not run on the main UDD host. I need to keep an eye on this.

There are still packages that are not updated. Worthwhile candidates are

    autodock
    bio-express (https://salsa.debian.org/med-team/express)
    Bio-eagle
    garli
    gdpc
    gromacs

A problem for me to report them is that I am uncertain about the time that needs to be passed until the update is run. Could you please help with a little line at the bottom alike:

This document is prepared every 24 hours for Debian packages selected in the Med Bio Task Description (https://salsa.debian.org/blends-team/med/blob/master/tasks/bio) with information in the Ultimate Debian Database (UDD, https://udd.debian.org). It was last updated at [Date+Time].

The good part of the problem is that whenever I check if something was updated, I'll add RRIDs to another package. This typically is from within salsa. Is that possibly not triggering the same routines as when I push from a remote-from-salsa repository?

Best,

Steffen
Re: RRID update on salsa on packages starting with A+B
Hi Steffen, On Fri, Mar 30, 2018 at 01:11:28AM +0200, Andreas Tille wrote: > > Please kindly have a look at clonalframe - it has a ref > > for SciCrunch, is yamllint clean but nothing shows up. > > H, good catch. > > clonalframe | SciCrunch | SCR_016060 > > is in UDD. I have no idea why it is not displayed on the tasks page. > Needs further investigation. Fixed now. I realised that importer and tasks pages generation scripts were all perfectly working. However, for a reason I did not understand the gatherer was not run on the main UDD host. I need to keep an eye on this. Kind regards Andreas. -- http://fam-tille.de
Re: RRID update on salsa on packages starting with A+B
On Thu, Mar 29, 2018 at 08:48:01PM +0200, Steffen Möller wrote: > > > > But I found artemis > > > (https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream) > > > which was not touched for a while and looks syntactically just fine but > > > has > > > its RRIDs not shown. > > Please kindly have a look at clonalframe - it has a ref > for SciCrunch, is yamllint clean but nothing shows up. H, good catch. clonalframe | SciCrunch | SCR_016060 is in UDD. I have no idea why it is not displayed on the tasks page. Needs further investigation. And flexbar is also fully read: flexbar | OMICtools | OMICS_01087 flexbar | bio.tools | flexbar flexbar | SciCrunch | SCR_013001 I need to check this in the next couple of days. Kind regards Andreas. -- http://fam-tille.de
Re: RRID update on salsa on packages starting with A+B
On 3/29/18 8:48 PM, Steffen Möller wrote: On 3/28/18 1:14 PM, Andreas Tille wrote: On Wed, Mar 28, 2018 at 12:03:54PM +0200, Steffen Möller wrote: But I found artemis (https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream) which was not touched for a while and looks syntactically just fine but has its RRIDs not shown. Please kindly have a look at clonalframe - it has a ref for SciCrunch, is yamllint clean but nothing shows up. For flexbar the only problem is the missing \n. Cheers, Steffen
Re: RRID update on salsa on packages starting with A+B
On 3/28/18 1:14 PM, Andreas Tille wrote: On Wed, Mar 28, 2018 at 12:03:54PM +0200, Steffen Möller wrote: But I found artemis (https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream) which was not touched for a while and looks syntactically just fine but has its RRIDs not shown. Please kindly have a look at clonalframe - it has a ref for SciCrunch, is yamllint clean but nothing shows up. Many thanks Steffen
Re: RRID update on salsa on packages starting with A+B
Hi Steffen, hi Andreas, 2018-03-28 14:16 GMT+02:00 Andreas Tille : > >> Via salsa, though, well, not. Should lintian invoke yamllint, possibly? > > As far as I know syntax errors in upstream files are fetched by lintian. > Its the downside of just using the not yet uploaded commits as data > input that we do not run lintian on these. > Currently, the lintian checks for integrity of d/u/metadata are disabled due to some security problem [1] but maybe should be re-enabled soon [2]. Best, Dylan [1] https://anonscm.debian.org/git/lintian/lintian.git/commit/checks/upstream-metadata.pm?id=6119d49c3b [2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=862373
Re: RRID update on salsa on packages starting with A+B
On Wed, Mar 28, 2018 at 02:10:33PM +0200, Steffen Möller wrote:
> > > Checking UDD log
> >
> > blends_prospective_gatherer.log:2018-03-27 08:32:05,582 - ERROR - (110):
> > Scanner error in file
> > /srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of
> > debian-med: mapping values are not allowed here
> > blends_prospective_gatherer.log: Entry: artemis
>
> Excellent. Thank you (and yamllint) for spotting that.

You are welcome.

> Concerning yamllint I am a routine user whenever I edit via the command
> line. And I also keep fixing the long lines.

Nice. :-)

> Via salsa, though, well, not. Should lintian invoke yamllint, possibly?

As far as I know, syntax errors in upstream files are caught by lintian. The downside of using the not yet uploaded commits as data input is that we do not run lintian on these.

Kind regards

Andreas.

-- http://fam-tille.de
Re: RRID update on salsa on packages starting with A+B
On 3/28/18 1:14 PM, Andreas Tille wrote: On Wed, Mar 28, 2018 at 12:03:54PM +0200, Steffen Möller wrote: But I found artemis (https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream) which was not touched for a while and looks syntactically just fine but has its RRIDs not shown. Checking UDD log blends_prospective_gatherer.log:2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of debian-med: mapping values are not allowed here blends_prospective_gatherer.log: Entry: artemis Excellent. Thank you (and yamllint) for spotting that. Concerning yamllint I am a routine user whenever I edit via the command line. And I also keep fixing the long lines. Via salsa, though, well, not. Should lintian invoke yamllint, possibly? Best, Steffen
Re: RRID update on salsa on packages starting with A+B
On Wed, Mar 28, 2018 at 12:03:54PM +0200, Steffen Möller wrote:
> But I found artemis
> (https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream)
> which was not touched for a while and looks syntactically just fine but has
> its RRIDs not shown.

Checking UDD log

    blends_prospective_gatherer.log:2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of debian-med: mapping values are not allowed here
    blends_prospective_gatherer.log: Entry: artemis

Checking package:

    $ yamllint debian/upstream/metadata
    debian/upstream/metadata
      1:1       warning  missing document start "---"  (document-start)
      2:81      error    line too long (111 > 80 characters)  (line-length)
      2:111     error    trailing spaces  (trailing-spaces)
      3:81      error    line too long (126 > 80 characters)  (line-length)
      12:81     error    line too long (82 > 80 characters)  (line-length)
      13:81     error    line too long (139 > 80 characters)  (line-length)
      23:81     error    line too long (83 > 80 characters)  (line-length)
      24:81     error    line too long (164 > 80 characters)  (line-length)
      25:81     error    line too long (104 > 80 characters)  (line-length)
      34:81     error    line too long (84 > 80 characters)  (line-length)
      41:9      error    syntax error: mapping values are not allowed here

Well, yamllint is a bit picky but the mapping values error matches the UDD importer. So let's have a look:

    $ git diff
    diff --git a/debian/upstream/metadata b/debian/upstream/metadata
    index 4bede0a..a9906a8 100644
    --- a/debian/upstream/metadata
    +++ b/debian/upstream/metadata
    @@ -37,5 +37,5 @@ Registry:
        Entry: SCR_004267
      - Name: OMICtools
        Entry: OMICS_00903
    - - Name; bio.tools
    + - Name: bio.tools
        Entry: artemis

... voila, the bug in question vanished.

> Please also have an eye on clonalframe
> (https://salsa.debian.org/med-team/clonalframe/tree/master/debian/upstream)
> which was not updated even though the other changes of mine seem to be all
> in.

Same here.
Importer says:

    blends_prospective_gatherer.log:2018-03-27 08:34:50,809 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/c/clonalframe.upstream of debian-med: mapping values are not allowed here

    $ yamllint debian/upstream/metadata
    debian/upstream/metadata
      1:1       warning  missing document start "---"  (document-start)
      21:10     error    syntax error: mapping values are not allowed here

    $ git diff
    diff --git a/debian/upstream/metadata b/debian/upstream/metadata
    index 79c7b56..98566d5 100644
    --- a/debian/upstream/metadata
    +++ b/debian/upstream/metadata
    @@ -17,5 +17,5 @@ Registry:
        Entry: SCR_016060
      - Name: bio.tools
        Entry: NA
    - - Name. OMICtools
    + - Name: OMICtools
        Entry: NA

So please use yamllint on all your updates - if not on all, then at least on those that are suspicious because they do not show up on the sentinel page. It seems that just editing in Salsa online is a bit error prone. From time to time (every half year or so) I'm doing some QA on the UDD log, but in between I will not notice those things.
Here is the full list of yaml issues in upstream files:

    $ grep "ERROR.*\.upstream of" blends_prospective_gatherer.log
    2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of debian-med: mapping values are not allowed here
    2018-03-27 08:33:15,614 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/o/octave-stk.upstream of pkg-octave: mapping values are not allowed here
    2018-03-27 08:33:17,622 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/o/octave-divand.upstream of pkg-octave: mapping values are not allowed here
    2018-03-27 08:34:00,876 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/r/rapmap.upstream of debian-med: mapping values are not allowed here
    2018-03-27 08:34:50,809 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/c/clonalframe.upstream of debian-med: mapping values are not allowed here

> I'll wait for tomorrow to indicate the others. Anyway, good to hear that no
> extra upload is required.

I can confirm that for this kind of data no upload was required for years.

Kind regards

Andreas.

-- http://fam-tille.de
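Editorial sketch: the grep over the gatherer log shown above can be turned into a small report of which upstream files failed to parse. The regular expression is written against the log lines quoted in this thread only; the function name `broken_upstream_files` is hypothetical and the real log may contain other line formats that this sketch ignores.

```python
import re

# Pattern modelled on the ERROR lines quoted in the mail above.
# It uses search(), so an optional "blends_prospective_gatherer.log:" prefix
# (as produced by grep over several files) is tolerated.
ERROR_RE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - ERROR - \(\d+\): "
    r"Scanner error in file (?P<path>\S+/(?P<source>[^/\s]+)\.upstream) "
    r"of (?P<team>[^:]+): (?P<message>.*)$"
)


def broken_upstream_files(lines):
    """Return (source, team, message) tuples for every scanner error line."""
    hits = []
    for line in lines:
        match = ERROR_RE.search(line.strip())
        if match:
            hits.append(
                (match.group("source"), match.group("team"), match.group("message"))
            )
    return hits


# Two lines copied from the thread, plus one unrelated line that must be skipped.
sample_log = [
    "2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file "
    "/srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of "
    "debian-med: mapping values are not allowed here",
    "2018-03-27 08:33:15,614 - ERROR - (110): Scanner error in file "
    "/srv/udd.debian.org/mirrors/machine-readable/o/octave-stk.upstream of "
    "pkg-octave: mapping values are not allowed here",
    "2018-04-01 14:57:13,013 - INFO - (364): Registry entry RRID from source "
    "autodocktools was removed since it is NA.",
]

for source, team, message in broken_upstream_files(sample_log):
    print(f"{team}/{source}: {message}")
```

Running something like this after each gatherer run would list the packages whose debian/upstream/metadata needs a yamllint pass, instead of waiting for the half-yearly QA mentioned above.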
Re: RRID update on salsa on packages starting with A+B
Hi Andreas, I'll get to the other points of your fine reply a bit later. The easy ones first On 3/27/18 9:04 AM, Andreas Tille wrote: Anyway. I came across * one or two entries Which ones? I had thought this would be dead easy to answer and then I added quite a few more references since indeed there was no RRID assigned, yet. Hm. Did I sometimes close the salsa editing window prior to pushing the edit? But I found artemis (https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream) which was not touched for a while and looks syntactically just fine but has its RRIDs not shown. Please also have an eye on clonalframe (https://salsa.debian.org/med-team/clonalframe/tree/master/debian/upstream) which was not updated even though the other changes of mine seem to be all in. I'll wait for tomorrow to indicate the others. Anyway, good to hear that no extra upload is required. Steffen
Re: RRID update on salsa on packages starting with A+B
Hi Steffen,

On Mon, Mar 26, 2018 at 07:23:24PM +0200, Steffen Möller wrote:
>
> I just procrastinated a bit into using the comfort of salsa to update
> debian/upstream/metadata and here the references to SciCrunch, OMICtools and
> bio.tools registries. All three registries have improved their coverage
> enormously over the past few months. I am deeply impressed.

Thanks a lot for the large update.

> Anyway. I came across
>
> * one or two entries

Which ones?

> that had perfect RRID descriptions on salsa but not on
> our task page - does the package need to be re-uploaded for the change to
> become visible?

Re-uploading is *not* needed. The data come from the Salsa Git repositories (about two weeks ago the machine-readable gatherer was switched from Alioth to Salsa). However, there is a delay of about 24 hours between a commit and the visibility of the data on the web sentinel, since at least two cron jobs are involved (one that gathers the data and one that creates the pages).

> * belvu and blixem that are from the same source package but have different
> task entries and also separate catalog entries in all three registries. This
> breaks the current UDD schema. I have annotated it now as ['belvu','blixem']
> (for bio.tools, the others analogously).
>
> Ideas for improvements anyone? Or is this how it should be for now?

I'm not sure. In any case the current gatherer code will do nothing (at best) or fail. It seems that we are lucky and it does not break. The thing is that if we change our data model somebody (currently only me) needs to adapt the code. Currently there is no chance to resolve

     - Name: OMICtools
       Entry: ['OMICS_23183', 'OMICS_23184', 'OMICS_15828']

or

     - Name: SciCrunch
       Entry: ['SCR_015989','SCR_015994', 'NA']

How should the gatherer magically guess which binary package to choose? The entry

     - Name: bio.tools
       Entry: ['belvu', 'blixem', 'dotter']

looks helpful - but it is just pure luck that bio.tools has chosen IDs matching our package names.
So I think your data model is not helpful since there is no chance to define a sequence of the binary packages built from one source package. Thus we somehow need to define the binary package name explicitly. For citations we are using the field Debian-package[1], which is for instance used for the meme package[2] (just to have another example, since in seqtools the dotter publication is also marked like this). However, this is because I once added an additional field "package" to the bibref table, which looks for instance like this:

    udd=# select * from bibref where (source = 'meme' or source = 'seqtools' ) and key = 'title';
      source  |  key  | value | package | rank
    ----------+-------+-------+---------+------
     meme     | title | MEME: discovering and analyzing DNA and protein sequence motifs |  | 0
     meme     | title | Discovering Sequence Motifs with Arbitrary Insertions and Deletions | glam2 | 0
     seqtools | title | SeqTools: visual tools for manual analysis of sequence alignments |  | 0
     seqtools | title | Scoredist: A simple and robust protein sequence distance estimator |  | 1
     seqtools | title | A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis | dotter | 0
     seqtools | title | A workbench for large-scale sequence homology analysis |  | 2

You see, packages with names different from the source package got an additional value in the package column, since it was defined in our data model first and implemented in the code afterwards. However, the registry table looks like this:

    udd=# select * from registry where source = 'seqtools';
      source  |   name    |           entry
    ----------+-----------+---------------------------
     seqtools | OMICtools | {OMICS_23183,OMICS_23184}
     seqtools | bio.tools | {belvu,blixem}
     seqtools | SciCrunch | {SCR_015989,SCR_015994}

That's the status before your last commit since the machine-readable gatherer cron job was not run yet. The gatherer takes what it gets and injects it into the database. It's not magic - it's code that needs to be adapted to a data model.
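Editorial sketch: one way to read the point above is that each registry item would need to name its binary package explicitly, the way citations do with the Debian-package field. The following is only an illustration of that idea under stated assumptions, not the existing gatherer and not an agreed data model: the function `registry_rows`, the reuse of a "Debian-package" key inside Registry items, and the four-column row layout are all hypothetical. Only the bio.tools IDs are used in the example, because the thread notes that these happen to match the package names.

```python
def registry_rows(source, registry):
    """Flatten registry items into (source, package, name, entry) rows.

    Assumption: an item may carry an optional 'Debian-package' key. An empty
    package string means the entry applies to the source package as a whole,
    mirroring the empty package column in the bibref table shown above.
    """
    rows = []
    for item in registry:
        entry = item.get("Entry")
        if entry is None or entry == "NA":
            # Placeholder values are not stored.
            continue
        if isinstance(entry, list):
            # A bare list gives no way to tell which binary each ID belongs to.
            raise ValueError(
                f"{source}: list-valued Entry for {item.get('Name')} is ambiguous"
            )
        rows.append((source, item.get("Debian-package", ""), item["Name"], entry))
    return rows


# Hypothetical metadata for seqtools with one explicit item per binary package.
seqtools = [
    {"Name": "bio.tools", "Entry": "belvu", "Debian-package": "belvu"},
    {"Name": "bio.tools", "Entry": "blixem", "Debian-package": "blixem"},
    {"Name": "bio.tools", "Entry": "dotter", "Debian-package": "dotter"},
]

for row in registry_rows("seqtools", seqtools):
    print(row)
```

The sketch rejects list-valued entries on purpose: as the mail says, a list alone does not define which ID belongs to which binary package.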
Changing the data model and hoping that something sensible will happen does not work. What we should clarify in advance is: Does the source column in the registry table make sense at all, or should it rather be a package column referring to binary packages? The web sentinel works on binary packages, so maybe we should not keep source package names but rather binary package names inside this table. Alternatively w