Re: RRID update on salsa on packages starting with A+B

2018-04-09 Thread Andreas Tille
On Fri, Apr 06, 2018 at 02:22:40PM +0200, Steffen Möller wrote:
> Whenever you find the moment, please review the presentation of
> 
> rate4site

$ yamllint metadata 
metadata
  1:1   warning  missing document start "---"  (document-start)
  7:8   error    trailing spaces  (trailing-spaces)
  10:2  error    syntax error: could not find expected ':'
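
The first two findings above (document start, trailing spaces) are simple enough to reproduce without yamllint. What follows is only a minimal Python sketch: `quick_lint` is a hypothetical helper, not part of yamllint, the sample text is made up, and the sketch does not detect real YAML syntax errors such as the missing ':' reported above.

```python
def quick_lint(text):
    """Return (line, column, message) tuples for two simple style problems."""
    problems = []
    lines = text.splitlines()
    # Approximation of yamllint's document-start rule: first line must be '---'.
    if not lines or lines[0].strip() != "---":
        problems.append((1, 1, 'missing document start "---"'))
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip(" \t")
        if stripped != line:
            # Report the 1-based column of the first trailing whitespace character.
            problems.append((number, len(stripped) + 1, "trailing spaces"))
    return problems


if __name__ == "__main__":
    # Made-up sample, not the real rate4site metadata.
    sample = "Name: rate4site\nContact: someone \n"
    for line, column, message in quick_lint(sample):
        print(f"{line}:{column}  {message}")
```

A check like this could run before committing debian/upstream/metadata, but running yamllint itself remains the more thorough option.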


> sambamba

$ git diff
diff --git a/debian/upstream/metadata b/debian/upstream/metadata
index ee8300c..b4afe9d 100644
--- a/debian/upstream/metadata
+++ b/debian/upstream/metadata
@@ -14,7 +14,7 @@ Reference:
   10.1093/bioinformatics/btv098"
  eprint: "https://academic.oup.com/bioinformatics/article-pdf/\
   31/12/2032/568750/btv098.pdf"
-Registries:
+Registry:
  - Name: bio.tools
    Entry: Sambamba
  - Name: OMICtools
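
The diff above corrects a misspelled top-level field ("Registries" instead of "Registry"). A rough way to catch such slips is sketched below in Python. The set of known fields is an illustrative subset taken from this thread only, and `unknown_top_level_keys` is a hypothetical helper, not something the UDD importer provides.

```python
import difflib
import re

# Illustrative subset only: these are the two fields mentioned in this thread.
KNOWN_FIELDS = {"Reference", "Registry"}


def unknown_top_level_keys(text, known=KNOWN_FIELDS):
    """Return (key, suggestion) pairs for unrecognised keys at column 0."""
    findings = []
    for line in text.splitlines():
        match = re.match(r"^([A-Za-z][\w-]*):", line)
        if match and match.group(1) not in known:
            key = match.group(1)
            close = difflib.get_close_matches(key, sorted(known), n=1)
            findings.append((key, close[0] if close else None))
    return findings


if __name__ == "__main__":
    sample = "Reference:\n  title: x\nRegistries:\n  - Name: bio.tools\n    Entry: Sambamba\n"
    for key, suggestion in unknown_top_level_keys(sample):
        print(f"unknown field {key!r}, did you mean {suggestion!r}?")
```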

> on the task page which do not show their references.

I would love it if someone other than me took over proof-reading this
kind of issue before the code is suspected to be wrong (= I will not
check the other two packages of your other mail right now).  Thanks to
anybody who helps in enhancing metadata, but I would love to concentrate
on bug fixing of packages.

Thank you

  Andreas.

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-04-08 Thread Steffen Möller


On 4/6/18 2:22 PM, Steffen Möller wrote:

Hi Andreas,

Whenever you find the moment, please review the presentation of

rate4site
sambamba


genometools
maffilter



on the task page which do not show their references.

Cheers,

Steffen





Re: RRID update on salsa on packages starting with A+B

2018-04-06 Thread Steffen Möller

Hi Andreas,

Whenever you find the moment, please review the presentation of

rate4site
sambamba

on the task page which do not show their references.

Cheers,

Steffen



Re: d/u/metadata for binary or source tree Re: RRID update on salsa on packages starting with A+B

2018-04-04 Thread Andreas Tille
On Wed, Apr 04, 2018 at 01:56:42PM +0200, Steffen Möller wrote:
> 
> > https://blends.debian.org/blends/apa.html#datagathering
> > 
> > (A.7.) could be a first shot on this problem.
> One half done. The invitation to contribute to salsa

I have no idea what you might consider *Blend* *specific* in using
Salsa.  Everything worked before Salsa and the fact that Git
repositories now can be changed via web form has nothing to do with
Blends.  Am I missing something?

> and the screenshots and translations are missing.

That's perfectly generic Debian stuff.  Well, I admit I wrote the
UDD importers for screenshots and translations because I intended
to use it on the tasks pages.  But it is used in very different
places for all Debian packages.  So what exactly do you want to be
documented here?  (Patches welcome)

> > That's not about branches.  As previously said:  Registry data are per
> > source package currently.  There is no means in the registry data table
> > to map an entry to a binary package.  Thus both autodock *and* autogrid
> > are missing registry data.
> 
> There was one d/u/metadata file that assigned different references IIRC to
> different binaries of that source package by inventing a sub-hierarchy.

Can you please re-read

   https://lists.debian.org/debian-med/2018/03/msg00119.html

Search for "For citations we are using the field Debian-package"

> This
> seems to indicate that this is not only an issue for the registry entries.
> How about the following:
> 
>  * d/u/metadata always refers to the source tree as a whole

That's the case.

>  * d/u/metadata also refers to the binary with the same name as the source
> tree.

That's also the case (more or less unintended but it is working like
this).
 
>  * information in d/u/metadata associated with a particular set of binaries
> with the same or different names shall be listed in a "binary: " attribute

See my posting.
 
> Admittedly, I see some issues with that. The notion of a reference binary
> for a package with many binaries would be helpful for Debian in general.
> This would prevent something like the Massxpert entry on our task list that
> describes itself as a transition package.

Hmmm, I do not understand what you mean.  Fixing the massxpert package
is only one commit away:

   
   https://salsa.debian.org/blends-team/med/commit/6451da2d816f762cf391f048676c79eee1c25431

Please, if anybody notices this, just fix the task in question instead
of assuming the cruft is displayed intentionally.  It's not *that* hard
to replace an outdated package name by a correct one.

> But this also means that such
> notion about, say, "end-user relevant" packages vs technical helper packages
> because of arch-independency or libraries etc, should be declared in
> d/control.

I guess you want to invent some extra layer of complexity for d/control
which will never be accepted to solve a problem for Registry entries
which was just solved for citations.  We just need the *decision* I was
asking for at the end of the mail I linked to above.
 
> Another concern of mine is that we typically do not tag any data for being
> specific for something. We put it in different files. As such, we would need
> d/u/.metadata.

Having data relevant for *binary* packages in a dir named *upstream*
does not sound very logical to me.
 
> For d/u/edam we had the concept of a summary representing the packages as a
> whole and then subsections for every binary.

I admit I do not fully understand the edam data.  I guess the only
reason that we are not facing issues with these data is that they are
just stored in UDD and the only (non-)use case is some query that
exports the data again.  I doubt that anybody has done some QA on the
output.

> I kind of liked it the way it
> is, but maybe separating that out to different files would also be
> appropriate. Could there be an option to do it either way?

Again:  I think that what we implemented for citations (Debian-Package
field) which is documented in Wiki[1] is absolutely sufficient to solve
the problem we have.  I see no reason to invent something else.  The
only thing that we should think about is the data storage model:  Do we
use another column as in the bibref table or should the importer
translate the data right to binary packages according to the algorithm
above:

   Source package = binary package if nothing else is specified.
   Use Debian-Package as binary package name otherwise.

It depends a bit on the applications that will consume that data.  For
the moment it is only the Blends web sentinel, which is rather binary
package centric.  The bibref table layout is rather a hack I created
after we ran into exactly the same problem as we have now with
references.  Maybe a binary package based table layout is more
sensible.
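
The two-line rule above can be sketched in Python as follows. This is only an illustration of the second storage option (translating to binary packages at import time), not the UDD importer. The function names are hypothetical, and the `Debian-Package` value in the example is an assumption, since the thread documents that field for citations only.

```python
def binary_package_for(entry, source_package):
    """Apply the rule to one metadata entry (a dict).

    Use Debian-Package if it is given, otherwise fall back to the
    binary package that carries the source package's name.
    """
    return entry.get("Debian-Package") or source_package


def registry_rows(source_package, registry):
    """Turn Registry entries into (binary package, registry name, entry) rows."""
    return [
        (binary_package_for(entry, source_package), entry["Name"], entry["Entry"])
        for entry in registry
    ]


if __name__ == "__main__":
    # Registry values as quoted in this thread; the Debian-Package line is hypothetical.
    registry = [
        {"Name": "bio.tools", "Entry": "AutoDock", "Debian-Package": "autodock"},
        {"Name": "SciCrunch", "Entry": "SCR_012746"},
    ]
    for row in registry_rows("autodocksuite", registry):
        print(row)
```

With rows keyed by binary package, a binary-centric consumer such as the web sentinel would not need an extra column as in the bibref table.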

> I cannot judge how much of a hassle it is to fiddle anything like that into
> the UDD. What do you think?

I keep on thinking that the established method is not much of a hassle

Re: RRID update on salsa on packages starting with A+B

2018-04-04 Thread Steffen Möller

Hi Tony,

On 4/3/18 1:59 PM, Tony Travis wrote:

On 03/04/18 11:27, Steffen Möller wrote:

[...]
This is a bit of a side-track from the core of this thread. Maybe we
would have Tony describing our achievements once they are manifesting in
Bio-Linux. And if we have a look at
https://salsa.debian.org/dashboard/milestones ? Maybe we can use those
as an anchor for achievements from which news could be generated in an
easier way, not necessarily by us.

[...]

Yes, I would be happy to do that.


Yippee.

I see five bits, there may be more:

A) New bits for Bio-Linux that its new release features

B) New bits for Bio-Linux that shall be coming at some future point

C) New bits for Debian and its derivative distributions that have been 
established


D) New bits for Debian and its derivative distributions that are 
currently being worked on


E) New bits for Debian and its derivatives that are currently being 
discussed on the Sprint(s) or mailing list(s).


In an ideal world we would see a lot moving from E to A and from B to E,
so the effort to describe this all would be mostly done only once :) I
know, it will not work, if only because the audiences are different.


From my PoV there is not too much of a conceptual difference between
porting to Bio-Linux and having backports within Debian. We need to
address both. The question is just who should do all that.


Since Debian Med takes about every opportunity to stress that it is just
a regular part of Debian, we somehow do not have a home page of our own,
since there is the Debian home page already. As such there is also no
page on which to distribute Debian Med-specific news. Closest is possibly


https://blends.debian.org/med/ and http://debian-med.alioth.debian.org/ .

Nobody truly wants to read through our Wiki page on 
https://wiki.debian.org/DebianMed .


We have http://debianmed.blogspot.de/ with latest news from 2014.

So, I have no immediate idea about how to improve on it all.



I'm now only doing minimal maintenance work on Bio-Linux 8, based on
Ubuntu 14.04 LTS (Trusty) while I start work on Bio-Linux 9, based on
Ubuntu-MATE 18.04 LTS (Bionic). I've got the 18.04 Ubuntu-MATE beta .iso
and I've started work remastering it for Bio-Linux using "customizer":


https://github.com/kamilion/customizer

I cannot judge. The readme reads fine.

I plan to base Bio-Linux 9 on Debian-Med + Bioconda


That is good. Please consider also installing the singularity and docker
clients. And the cwltool.


and would like to
start adding Ubuntu 18.04 (Bionic) versions of Debian-Med packages to
the Debian-Med PPA if nobody objects. I also want to drop ALL the 'bad'
NEBC packages (i.e. the binary-only packages that Bela and Tim used
initially to migrate from Debian testing (Sarge) to Ubuntu 6.06 LTS).

Yip.

I've listened to Andreas: I intend to create a "bio-linux-desktop"
meta-package modelled on the Debian-Med/Bio* tasks that are now being
updated to include some 'missing' packages that were only in Bio-Linux.


We should think about adding the version in Ubuntu and Bio-Linux to show 
up among the versions in the task pages. I would not know how to 
implement that, admittedly.


Best,

Steffen







d/u/metadata for binary or source tree Re: RRID update on salsa on packages starting with A+B

2018-04-04 Thread Steffen Möller


On 4/3/18 5:09 PM, Andreas Tille wrote:

On Tue, Apr 03, 2018 at 12:27:08PM +0200, Steffen Möller wrote:

There is a daily cron job parsing Salsa directories.

Fine. Somewhere there is (or should be :o) ) documentation of how this page
is crafted. On our Wiki? Let us then have a link to that page.

Maybe this

https://blends.debian.org/blends/apa.html#datagathering

(A.7.) could be a first shot on this problem.
One half done. The invitation to contribute to salsa and the screenshots
and translations are missing.



Could branches for your cron job's autodock checkout differ? The page was
updated this morning, but not yet the references. Or is there a second
directory "autodock" when the source package name is "autodocksuite"
(because of the joint autogrid tool)?

That's not about branches.  As previously said:  Registry data are per
source package currently.  There is no means in the registry data table
to map an entry to a binary package.  Thus both autodock *and* autogrid
are missing registry data.


There was one d/u/metadata file that assigned different references IIRC 
to different binaries of that source package by inventing a 
sub-hierarchy. This seems to indicate that this is not only an issue for 
the registry entries. How about the following:


 * d/u/metadata always refers to the source tree as a whole

 * d/u/metadata also refers to the binary with the same name as the 
source tree.


 * information in d/u/metadata associated with a particular set of
binaries with the same or different names shall be listed in a
"binary: " attribute


Admittedly, I see some issues with that. The notion of a reference 
binary for a package with many binaries would be helpful for Debian in 
general. This would prevent something like the Massxpert entry on our 
task list that describes itself as a transition package. But this also 
means that such notion about, say, "end-user relevant" packages vs 
technical helper packages because of arch-independency or libraries etc, 
should be declared in d/control.


Another concern of mine is that we typically do not tag any data for 
being specific for something. We put it in different files. As such, we 
would need d/u/.metadata.


For d/u/edam we had the concept of a summary representing the packages 
as a whole and then subsections for every binary. I kind of liked it the 
way it is, but maybe separating that out to different files would also 
be appropriate. Could there be an option to do it either way?


I cannot judge how much of a hassle it is to fiddle anything like that 
into the UDD. What do you think?


Best,

Steffen




Re: RRID update on salsa on packages starting with A+B

2018-04-03 Thread Andreas Tille
Hi Tony,

On Tue, Apr 03, 2018 at 12:59:46PM +0100, Tony Travis wrote:
> > [...]
> > This is a bit of a side-track from the core of this thread. Maybe we
> > would have Tony describing our achievements once they are manifesting in
> > Bio-Linux. And if we have a look at
> > https://salsa.debian.org/dashboard/milestones ? Maybe we can use those
> > as an anchor for achievements from which news could be generated in an
> > easier way, not necessarily by us.
> 
> Yes, I would be happy to do that.

I admit I do not mind which technique is used.  I think the person
who is doing the actual publicity work should freely choose the channel
for communication.
 
> I plan to base Bio-Linux 9 on Debian-Med + Bioconda and would like to
> start adding Ubuntu 18.04 (Bionic) versions of Debian-Med packages to
> the Debian-Med PPA if nobody objects. I also want to drop ALL the 'bad'
> NEBC packages (i.e. the binary-only packages that Bela and Tim used
> initially to migrate from Debian testing (Sarge) to Ubuntu 6.06 LTS).

Please let us know if we should step up packaging efforts for certain
packages you consider very important.  As I have seen for some older
packages, the licensing is frequently the only blocker; technical issues
are easy to solve.
 
> I've listened to Andreas: I intend to create a "bio-linux-desktop"
> meta-package modelled on the Debian-Med/Bio* tasks that are now being
> updated to include some 'missing' packages that were only in Bio-Linux.

Feel free to discuss this here.
 
> Thanks again to all in the Debian-Med team for your help and support,

You are welcome

Andreas. 

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-04-03 Thread Andreas Tille
On Tue, Apr 03, 2018 at 12:27:08PM +0200, Steffen Möller wrote:
> > There is a daily cron job parsing Salsa directories.
> Fine. Somewhere there is (or should be :o) ) documentation of how this page
> is crafted. On our Wiki? Let us then have a link to that page.

Maybe this

   https://blends.debian.org/blends/apa.html#datagathering

(A.7.) could be a first shot on this problem.

> Could branches for your cron job's autodock checkout differ? The page was
> updated this morning, but not yet the references. Or is there a second
> directory "autodock" when the source package name is "autodocksuite"
> (because of the joint autogrid tool)?

That's not about branches.  As previously said:  Registry data are per
source package currently.  There is no means in the registry data table
to map an entry to a binary package.  Thus both autodock *and* autogrid
are missing registry data.
 
Kind regards

  Andreas. 

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-04-03 Thread Tony Travis
On 03/04/18 11:27, Steffen Möller wrote:
> [...]
> This is a bit of a side-track from the core of this thread. Maybe we
> would have Tony describing our achievements once they are manifesting in
> Bio-Linux. And if we have a look at
> https://salsa.debian.org/dashboard/milestones ? Maybe we can use those
> as an anchor for achievements from which news could be generated in an
> easier way, not necessarily by us.

Hi, Steffen.

Yes, I would be happy to do that.

I'm now only doing minimal maintenance work on Bio-Linux 8, based on
Ubuntu 14.04 LTS (Trusty) while I start work on Bio-Linux 9, based on
Ubuntu-MATE 18.04 LTS (Bionic). I've got the 18.04 Ubuntu-MATE beta .iso
and I've started work remastering it for Bio-Linux using "customizer":

> https://github.com/kamilion/customizer

I plan to base Bio-Linux 9 on Debian-Med + Bioconda and would like to
start adding Ubuntu 18.04 (Bionic) versions of Debian-Med packages to
the Debian-Med PPA if nobody objects. I also want to drop ALL the 'bad'
NEBC packages (i.e. the binary-only packages that Bela and Tim used
initially to migrate from Debian testing (Sarge) to Ubuntu 6.06 LTS).

I've listened to Andreas: I intend to create a "bio-linux-desktop"
meta-package modelled on the Debian-Med/Bio* tasks that are now being
updated to include some 'missing' packages that were only in Bio-Linux.

Thanks again to all in the Debian-Med team for your help and support,

  Tony.

-- 
Minke Informatics Limited, Registered in Scotland - Company No. SC419028
Registered Office: 3 Donview, Bridge of Alford, AB33 8QJ, Scotland (UK)
tel. +44(0)19755 63548    http://minke-informatics.co.uk
mob. +44(0)7985 078324    mailto:tony.tra...@minke-informatics.co.uk



Re: RRID update on salsa on packages starting with A+B

2018-04-03 Thread Steffen Möller


On 4/3/18 10:30 AM, Andreas Tille wrote:

On Tue, Apr 03, 2018 at 12:14:57AM +0200, Steffen Möller wrote:

Registry:
   - Name: OMICtools
     Entry: NA
   - Name: RRID
     Entry: NA
   - Name: bio.tools
     Entry: NA

That is an interesting one. Please kindly check

https://salsa.debian.org/med-team/autodocksuite/blob/master/debian/upstream/metadata

which on my side shows

Registry:
  - Name: OMICtools
    Entry: OMICS_19997
  - Name: SciCrunch
    Entry: SCR_012746
  - Name: bio.tools
    Entry: AutoDock

Maybe you forgot to push.  I now received

commit e5292af2136df30df8ea2a0da0dbeba6b82b027b (HEAD -> master, origin/master)
Author: Steffen Möller 
Date:   Mon Mar 26 16:05:33 2018 +

 Added RRID to metadata

which has the values you are mentioning.  This was not available in the public
repository (at least until 1st of April).

Hm. No. Have not touched it again. May be some delay-thingy somewhere.

I remember it was not your preferred solution but for the moment 'NA'
values are not stored. [...]

How would you like Name and no fancy colouring?

The idea was to inform the world (and ourselves) that we have checked.

I admit I do not like this.  The thing is that the maintainers of the
repository could have updated their data.  Recent development shows that
this is a very probable action these days.  I do not think that the fact
that we have checked is information which is valuable for a random
visitor of the tasks page, and it might be simply wrong since the data
were updated.  So I keep on thinking that 'NA' is not anything
interesting for the page.  For us as developers we can easily do

 grep -w 'NA' */debian/upstream/metadata

if we want to find the information.


Right. If we developers have the whole of Debian Med checked out. I don't.
It is a side-issue. Don't worry.


The
problem is that we are unlikely to be informed about an entry being added to
the registry, so we would look bad. But since the advent of salsa.d.o I am
tempted to risk that.

Sorry, I do not understand this.


We have that mistake only for a few days upon notification.


  

bio-express (https://salsa.debian.org/med-team/express)

It is rather berkeley-express.  This works now[1].  The reason for the
problem was that the repository name is different from the source
package name.  The importer was simply wrong for these cases.  This is
fixed now (hopefully!) but we should definitely avoid this kind of
divergence and I'm tempted to adapt the repository name to the source
package name once we migrate anonscm.d.o to salsa.d.o (I've lost hope
for some sensible solution to keep anonscm working :-((().

Aaah. Yes. This sounds like a reasonable addition to the Debian Med policy.

We are intuitively doing it in close to all of our packages but
specifying it explicitly in policy makes sense, definitely.


gromacs

No Registry entries in d/upstream/metadata - so nothing to display here.

:o/ Because that is debichem and I looked at my own not-yet-merged branch.

Why not ask for membership in Debichem and commit directly?


I was with debichem on alioth and eventually will ask for membership again.
However, I would lose the external view on how we present ourselves. And I
am very keen on attracting casual contributors to our cause, e.g. to improve
the description of our packages or ... registry entries in d/u/metadata. So,
debichem is a test-case to teach me how this feels on the other side.





This document is prepared every 24 hours for Debian packages selected in the
Med Bio Task Description
(https://salsa.debian.org/blends-team/med/blob/master/tasks/bio)
with information in the Ultimate Debian Database (UDD, https://udd.debian.org).

The bottom line has a creation date of the web page.  Currently it says:

  Last update: Mon, 02 Apr 2018 19:21:37 -

Excellent. I have missed that one.

It was last updated at [Date+Time].

This information is not available, sorry.

I am uncertain about what "this" refers to, but maybe you have some idea
about how the page can explain itself to the ones who want to contribute
code or content.

There is a daily cron job parsing Salsa directories.

Fine. Somewhere there is (or should be :o) ) documentation of how this page
is crafted. On our Wiki? Let us then have a link to that page. Casual
contributors, that is what I am primarily after. For everything else I
happily ask you or Ole.

I do not keep
track on the information when this job is finished.  The UDD importer is
reading the result of this job later and I also do not keep track when
this job ends.  So we have two uncertain times - the only time that can
be safely displayed is when the web pages are displayed which is done on
the bottom line.  I admit that I'm not very motivated to make some
effort to keep track of the other times compared to other tasks on my
desk.


That is fine.

There could be a line like "Please allow 48 hours for your change to the
package selection or the package description to have an effect on th

Re: RRID update on salsa on packages starting with A+B

2018-04-03 Thread Andreas Tille
On Tue, Apr 03, 2018 at 12:14:57AM +0200, Steffen Möller wrote:
> 
> > Registry:
> >   - Name: OMICtools
> >     Entry: NA
> >   - Name: RRID
> >     Entry: NA
> >   - Name: bio.tools
> >     Entry: NA
> 
> That is an interesting one. Please kindly check
> 
> https://salsa.debian.org/med-team/autodocksuite/blob/master/debian/upstream/metadata
> 
> which on my side shows
> 
> Registry:
>  - Name: OMICtools
>    Entry: OMICS_19997
>  - Name: SciCrunch
>    Entry: SCR_012746
>  - Name: bio.tools
>    Entry: AutoDock

Maybe you forgot to push.  I now received

commit e5292af2136df30df8ea2a0da0dbeba6b82b027b (HEAD -> master, origin/master)
Author: Steffen Möller 
Date:   Mon Mar 26 16:05:33 2018 +

Added RRID to metadata

which has the values you are mentioning.  This was not available in the public
repository (at least until 1st of April).
 
> > I remember it was not your preferred solution but for the moment 'NA'
> > values are not stored. [...]
> How would you like Name and no fancy colouring?
> 
> The idea was to inform the world (and ourselves) that we have checked.

I admit I do not like this.  The thing is that the maintainers of the
repository could have updated their data.  Recent development shows that
this is a very probable action these days.  I do not think that the fact
that we have checked is information which is valuable for a random
visitor of the tasks page, and it might be simply wrong since the data
were updated.  So I keep on thinking that 'NA' is not anything
interesting for the page.  For us as developers we can easily do

grep -w 'NA' */debian/upstream/metadata

if we want to find the information.
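
The same search can be approximated in Python for anyone who wants to post-process the hits. This is a minimal sketch that assumes, like the grep call, that the packages are checked out side by side under one directory. `find_na_entries` is a hypothetical helper, and `\bNA\b` only approximates the word matching of `grep -w`.

```python
import re
from pathlib import Path

# Whole-word match for NA, roughly what grep -w 'NA' does.
NA_WORD = re.compile(r"\bNA\b")


def find_na_entries(root):
    """Return (package, line number, line) for every NA hit below root."""
    hits = []
    for path in sorted(Path(root).glob("*/debian/upstream/metadata")):
        package = path.parents[2].name  # <package>/debian/upstream/metadata
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if NA_WORD.search(line):
                hits.append((package, number, line.strip()))
    return hits


if __name__ == "__main__":
    for package, number, line in find_na_entries("."):
        print(f"{package}:{number}: {line}")
```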

> The
> problem is that we are unlikely to be informed about an entry being added to
> the registry, so we would look bad. But since the advent of salsa.d.o I am
> tempted to risk that.

Sorry, I do not understand this.
 
> > > bio-express (https://salsa.debian.org/med-team/express)
> > It is rather berkeley-express.  This works now[1].  The reason for the
> > problem was that the repository name is different from the source
> > package name.  The importer was simply wrong for these cases.  This is
> > fixed now (hopefully!) but we should definitely avoid this kind of
> > divergence and I'm tempted to adapt the repository name to the source
> > package name once we migrate anonscm.d.o to salsa.d.o (I've lost hope
> > for some sensible solution to keep anonscm working :-((().
> Aaah. Yes. This sounds like a reasonable addition to the Debian Med policy.

We are intuitively doing it in close to all of our packages but
specifying it explicitly in policy makes sense, definitely.

> > > gromacs
> > No Registry entries in d/upstream/metadata - so nothing to display here.
> > :o/ Because that is debichem and I looked at my own not-yet-merged branch.

Why not ask for membership in Debichem and commit directly?

> > > This document is prepared every 24 hours for Debian packages selected in
> > > the Med Bio Task Description
> > > (https://salsa.debian.org/blends-team/med/blob/master/tasks/bio)
> > > with information in the Ultimate Debian Database
> > > (UDD, https://udd.debian.org).
> > The bottom line has a creation date of the web page.  Currently it says:
> > 
> >  Last update: Mon, 02 Apr 2018 19:21:37 -
> Excellent. I have missed that one.

> > > It was last updated at [Date+Time].
> > This information is not available, sorry.
> 
> I am uncertain about what "this" refers to, but maybe you have some idea
> about how the page can explain itself to the ones who want to contribute
> code or content.

There is a daily cron job parsing Salsa directories.  I do not keep
track on the information when this job is finished.  The UDD importer is
reading the result of this job later and I also do not keep track when
this job ends.  So we have two uncertain times - the only time that can
be safely displayed is when the web pages are displayed which is done on
the bottom line.  I admit that I'm not very motivated to make some
effort to keep track of the other times compared to other tasks on my
desk.
 
> > There is no difference between pushing from remote repository and a web
> > interface edit.
> 
> Good.
> 
> Curious about this autodock/autodocksuite thingy and maybe there is some
> chance to have this self-explaining bit at the end.

I have no idea how this can happen.  I do not use the web interface for
editing package data.
 
> I have added registry links to about all green entries in bio now. Let us
> review those over the next weeks a bit and then think about announcing it
> with the Debian News or whatever the list finds appropriate.

I'd love if we could find some default channel for Debian Med news (and
a person who feels dedicated to feed this channel regularly). 

Kind regards

   Andreas.

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-04-02 Thread Steffen Möller


On 4/2/18 11:32 PM, Andreas Tille wrote:

Hi Steffen,

thanks a lot for your continuous work on assembling registry data.

On Sat, Mar 31, 2018 at 02:41:42AM +0200, Steffen Möller wrote:

There are still packages that are not updated. Worthwhile candidates
are

autodock

This has:

Registry:
  - Name: OMICtools
    Entry: NA
  - Name: RRID
    Entry: NA
  - Name: bio.tools
    Entry: NA


That is an interesting one. Please kindly check

https://salsa.debian.org/med-team/autodocksuite/blob/master/debian/upstream/metadata

which on my side shows

Registry:
 - Name: OMICtools
   Entry: OMICS_19997
 - Name: SciCrunch
   Entry: SCR_012746
 - Name: bio.tools
   Entry: AutoDock



I remember it was not your preferred solution but for the moment 'NA'
values are not stored. [...]

How would you like Name and no fancy colouring?

The idea was to inform the world (and ourselves) that we have checked. 
The problem is that we are unlikely to be informed about an entry being 
added to the registry, so we would look bad. But since the advent of 
salsa.d.o I am tempted to risk that.



bio-express (https://salsa.debian.org/med-team/express)

It is rather berkeley-express.  This works now[1].  The reason for the
problem was that the repository name is different from the source
package name.  The importer was simply wrong for these cases.  This is
fixed now (hopefully!) but we should definitely avoid this kind of
divergence and I'm tempted to adapt the repository name to the source
package name once we migrate anonscm.d.o to salsa.d.o (I've lost hope
for some sensible solution to keep anonscm working :-((().

Aaah. Yes. This sounds like a reasonable addition to the Debian Med policy.



Bio-eagle

Works now[2].  Same as above.

Thanks.



garli

Works (and as far as I can see it has also worked before - no idea why
it is on your list).

Most likely I did not wait long enough prior to reporting.

gdpc

Same here - just works.


gromacs

No Registry entries in d/upstream/metadata - so nothing to display here.

:o/ Because that is debichem and I looked at my own not-yet-merged branch.

A problem for me in reporting them is that I am uncertain about the time
that needs to pass until the update is run. Could you please help with a
little line at the bottom like:

This document is prepared every 24 hours for Debian packages selected in the
Med Bio Task Description
(https://salsa.debian.org/blends-team/med/blob/master/tasks/bio)
with information in the Ultimate Debian Database (UDD, https://udd.debian.org).

The bottom line has a creation date of the web page.  Currently it says:

 Last update: Mon, 02 Apr 2018 19:21:37 -

Excellent. I have missed that one.

It was last updated at [Date+Time].

This information is not available, sorry.


I am uncertain about what "this" refers to, but maybe you have some idea
about how the page can explain itself to the ones who want to contribute
code or content.






The good part of the problem is that whenever I check if something was
updated, I'll add RRIDs to another package. This typically is from within
salsa. Is that possibly not triggering the same routines as when I push from
a remote-from-salsa repository?

There is no difference between pushing from remote repository and a web
interface edit.


Good.

Curious about this autodock/autodocksuite thingy and maybe there is some 
chance to have this self-explaining bit at the end.


I have added registry links to about all green entries in bio now. Let
us review those over the next weeks a bit and then think about
announcing it with the Debian News or whatever the list finds appropriate.


Best,

Steffen




Re: RRID update on salsa on packages starting with A+B

2018-04-02 Thread Andreas Tille
Hi Steffen,

thanks a lot for your continuous work on assembling registry data.

On Sat, Mar 31, 2018 at 02:41:42AM +0200, Steffen Möller wrote:
> There are still packages that are not updated. Worthwhile candidates
> are
> 
> autodock

This has:

Registry:
 - Name: OMICtools
   Entry: NA
 - Name: RRID
   Entry: NA
 - Name: bio.tools
   Entry: NA

I remember it was not your preferred solution but for the moment 'NA'
values are not stored.  The importer log says:

2018-04-01 14:57:13,012 - DEBUG - (125): Registry data found for source 'autodocktools' of debian-med: registry = [{'Entry': 'NA', 'Name': 'OMICtools'}, {'Entry': 'NA', 'Name': 'RRID'}, {'Entry': 'NA', 'Name': 'bio.tools'}]
2018-04-01 14:57:13,013 - INFO - (364): Registry entry OMICtools from source autodocktools was removed since it is NA.
2018-04-01 14:57:13,013 - INFO - (364): Registry entry RRID from source autodocktools was removed since it is NA.
2018-04-01 14:57:13,013 - INFO - (364): Registry entry bio.tools from source autodocktools was removed since it is NA.
2018-04-01 14:57:13,013 - DEBUG - (415): No registry data for source 'autodocktools' of debian-med: upstream.registry = []


So this works as expected.
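
The behaviour visible in the log can be sketched roughly as follows.  This
is only a minimal illustration and not the actual UDD importer code; the
function name drop_na_entries is made up for this example.

```python
def drop_na_entries(registry):
    """Return only the registry entries whose Entry is not the placeholder 'NA'."""
    kept = []
    for item in registry:
        if item.get("Entry") == "NA":
            # Corresponds to the INFO lines in the log: NA entries are not stored.
            continue
        kept.append(item)
    return kept


# Registry data as reported in the DEBUG line of the importer log.
autodock = [
    {"Name": "OMICtools", "Entry": "NA"},
    {"Name": "RRID", "Entry": "NA"},
    {"Name": "bio.tools", "Entry": "NA"},
]
print(drop_na_entries(autodock))  # prints []
```

With all three entries set to NA nothing is left to store, which matches
the final "upstream.registry = []" line.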

> bio-express (https://salsa.debian.org/med-team/express)

It is rather berkeley-express.  This works now[1].  The reason for the
problem was that the repository name is different from the source
package name.  The importer was simply wrong for these cases.  This is
fixed now (hopefully!) but we should definitely avoid this kind of
divergence and I'm tempted to adapt the repository name to the source
package name once we migrate anonscm.d.o to salsa.d.o (I've lost hope
for some sensible solution to keep anonscm working :-((().

> Bio-eagle

Works now[2].  Same as above.

> garli

Works (and as far as I can see it has also worked before - no idea why
it is on your list).

> gdpc

Same here - just works.

> gromacs

No Registry entries in d/upstream/metadata - so nothing to display here.
 
> a problem for me to report them is that I am uncertain about the time that
> needs to pass until the update is run. Could you please help with a little
> line at the bottom like:
> 
> This document is prepared every 24 hours for Debian packages selected in the
> Med Bio Task Description
> (https://salsa.debian.org/blends-team/med/blob/master/tasks/bio) with
> information in the Ultimate Debian Database (UDD, https://udd.debian.org).

The bottom line has a creation date of the web page.  Currently it says:

Last update: Mon, 02 Apr 2018 19:21:37 -

> It was last updated at [Date+Time].

This information is not available, sorry.

> The good part of the problem is that whenever I check if something was
> updated, I'll add RRIDs to another package. This typically is from within
> salsa. Is that possibly not triggering the same routines as when I push from
> a remote-from-salsa repository?

There is no difference between pushing from remote repository and a web
interface edit.

Kind regards

   Andreas.

[1] https://blends.debian.org/med/tasks/bio#berkeley-express 
[2] https://blends.debian.org/med/tasks/bio#bio-eagle

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-03-30 Thread Steffen Möller

Hi Andreas,

On 3/30/18 3:14 PM, Andreas Tille wrote:

Hi Steffen,

On Fri, Mar 30, 2018 at 01:11:28AM +0200, Andreas Tille wrote:

Please kindly have a look at clonalframe - it has a ref
for SciCrunch, is yamllint clean but nothing shows up.

Hmm, good catch.

  clonalframe   | SciCrunch | SCR_016060

is in UDD.  I have no idea why it is not displayed on the tasks page.
Needs further investigation.

Fixed now.

Happy here. :)

I realised that importer and tasks pages generation scripts
were all perfectly working.  However, for a reason I did not understand
the gatherer was not run on the main UDD host.  I need to keep an eye on
this.


There are still packages that are not updated. Worthwhile candidates
are

autodock
bio-express (https://salsa.debian.org/med-team/express)
Bio-eagle
garli
gdpc
gromacs

A problem for me in reporting them is that I am uncertain about how much
time needs to pass until the update is run. Could you please help with a
little line at the bottom like:


This document is prepared every 24 hours for Debian packages selected in
the Med Bio Task Description
(https://salsa.debian.org/blends-team/med/blob/master/tasks/bio) with
information in the Ultimate Debian Database (UDD,
https://udd.debian.org).  It was last updated at [Date+Time].


The good part of the problem is that whenever I check if something was 
updated, I'll add RRIDs to another package. This typically is from 
within salsa. Is that possibly not triggering the same routines as when 
I push from a remote-from-salsa repository?


Best,

Steffen




Re: RRID update on salsa on packages starting with A+B

2018-03-30 Thread Andreas Tille
Hi Steffen,

On Fri, Mar 30, 2018 at 01:11:28AM +0200, Andreas Tille wrote:
> > Please kindly have a look at clonalframe - it has a ref
> > for SciCrunch, is yamllint clean but nothing shows up.
> 
> Hmm, good catch.
> 
>  clonalframe   | SciCrunch | SCR_016060
> 
> is in UDD.  I have no idea why it is not displayed on the tasks page.
> Needs further investigation.

Fixed now.  I realised that importer and tasks pages generation scripts
were all perfectly working.  However, for a reason I did not understand
the gatherer was not run on the main UDD host.  I need to keep an eye on
this.

Kind regards

  Andreas.

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-03-29 Thread Andreas Tille
On Thu, Mar 29, 2018 at 08:48:01PM +0200, Steffen Möller wrote:
> 
> > > But I found artemis
> > > (https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream)
> > > which was not touched for a while and looks syntactically just fine but 
> > > has
> > > its RRIDs not shown.
> 
> Please kindly have a look at clonalframe - it has a ref
> for SciCrunch, is yamllint clean but nothing shows up.

Hmm, good catch.

 clonalframe   | SciCrunch | SCR_016060

is in UDD.  I have no idea why it is not displayed on the tasks page.
Needs further investigation.  And flexbar is also fully read:

 flexbar   | OMICtools | OMICS_01087
 flexbar   | bio.tools | flexbar
 flexbar   | SciCrunch | SCR_013001

I need to check this in the next couple of days. 

Kind regards

   Andreas.

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-03-29 Thread Steffen Möller


On 3/29/18 8:48 PM, Steffen Möller wrote:


On 3/28/18 1:14 PM, Andreas Tille wrote:

On Wed, Mar 28, 2018 at 12:03:54PM +0200, Steffen Möller wrote:

But I found artemis
(https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream)
which was not touched for a while and looks syntactically just fine 
but has

its RRIDs not shown.


Please kindly have a look at clonalframe - it has a ref
for SciCrunch, is yamllint clean but nothing shows up.


For flexbar it is only a \n that is missing.

Cheers,

Steffen



Re: RRID update on salsa on packages starting with A+B

2018-03-29 Thread Steffen Möller


On 3/28/18 1:14 PM, Andreas Tille wrote:

On Wed, Mar 28, 2018 at 12:03:54PM +0200, Steffen Möller wrote:

But I found artemis
(https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream)
which was not touched for a while and looks syntactically just fine but has
its RRIDs not shown.


Please kindly have a look at clonalframe - it has a ref
for SciCrunch, is yamllint clean but nothing shows up.

Many thanks

Steffen



Re: RRID update on salsa on packages starting with A+B

2018-03-28 Thread Dylan Aïssi
Hi Steffen, hi Andreas,

2018-03-28 14:16 GMT+02:00 Andreas Tille :
>
>> Via salsa, though, well, not. Should lintian invoke yamllint, possibly?
>
> As far as I know syntax errors in upstream files are caught by lintian.
> It's the downside of just using the not yet uploaded commits as data
> input that we do not run lintian on these.
>

Currently, the lintian checks for integrity of d/u/metadata are
disabled due to a security problem [1] but they may be
re-enabled soon [2].

Best,
Dylan

[1] 
https://anonscm.debian.org/git/lintian/lintian.git/commit/checks/upstream-metadata.pm?id=6119d49c3b
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=862373



Re: RRID update on salsa on packages starting with A+B

2018-03-28 Thread Andreas Tille
On Wed, Mar 28, 2018 at 02:10:33PM +0200, Steffen Möller wrote:
> 
> > Checking UDD log
> > 
> > blends_prospective_gatherer.log:2018-03-27 08:32:05,582 - ERROR - (110): 
> > Scanner error in file 
> > /srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of 
> > debian-med: mapping values are not allowed here
> > blends_prospective_gatherer.log:   Entry: artemis
> 
> Excellent. Thank you (and yamllint) for spotting that.

You are welcome.
 
> Concerning yamllint I am a routine user whenever I edit via the command
> line. And I also keep fixing the long lines.

Nice. :-)
 
> Via salsa, though, well, not. Should lintian invoke yamllint, possibly?

As far as I know syntax errors in upstream files are caught by lintian.
It's the downside of just using the not yet uploaded commits as data
input that we do not run lintian on these.

Kind regards

 Andreas.

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-03-28 Thread Steffen Möller


On 3/28/18 1:14 PM, Andreas Tille wrote:

On Wed, Mar 28, 2018 at 12:03:54PM +0200, Steffen Möller wrote:

But I found artemis
(https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream)
which was not touched for a while and looks syntactically just fine but has
its RRIDs not shown.

Checking UDD log

blends_prospective_gatherer.log:2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of debian-med: mapping values are not allowed here
blends_prospective_gatherer.log:   Entry: artemis


Excellent. Thank you (and yamllint) for spotting that.

Concerning yamllint I am a routine user whenever I edit via the command 
line. And I also keep fixing the long lines.


Via salsa, though, well, not. Should lintian invoke yamllint, possibly?

Best,

Steffen



Re: RRID update on salsa on packages starting with A+B

2018-03-28 Thread Andreas Tille
On Wed, Mar 28, 2018 at 12:03:54PM +0200, Steffen Möller wrote:
> But I found artemis
> (https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream)
> which was not touched for a while and looks syntactically just fine but has
> its RRIDs not shown.

Checking UDD log

blends_prospective_gatherer.log:2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of debian-med: mapping values are not allowed here
blends_prospective_gatherer.log:   Entry: artemis


Checking package:

$ yamllint debian/upstream/metadata
debian/upstream/metadata
  1:1   warning  missing document start "---"  (document-start)
  2:81  error    line too long (111 > 80 characters)  (line-length)
  2:111 error    trailing spaces  (trailing-spaces)
  3:81  error    line too long (126 > 80 characters)  (line-length)
  12:81 error    line too long (82 > 80 characters)  (line-length)
  13:81 error    line too long (139 > 80 characters)  (line-length)
  23:81 error    line too long (83 > 80 characters)  (line-length)
  24:81 error    line too long (164 > 80 characters)  (line-length)
  25:81 error    line too long (104 > 80 characters)  (line-length)
  34:81 error    line too long (84 > 80 characters)  (line-length)
  41:9  error    syntax error: mapping values are not allowed here

Well, yamllint is a bit picky but the mapping values error matches the
UDD importer.

So let's have a look:

$ git diff
diff --git a/debian/upstream/metadata b/debian/upstream/metadata
index 4bede0a..a9906a8 100644
--- a/debian/upstream/metadata
+++ b/debian/upstream/metadata
@@ -37,5 +37,5 @@ Registry:
    Entry: SCR_004267
  - Name: OMICtools
    Entry: OMICS_00903
- - Name; bio.tools
+ - Name: bio.tools
    Entry: artemis

... voila, the bug in question vanished.
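
For edits made in the Salsa web interface, where yamllint is not at hand, a
very crude pre-check for exactly this kind of typo could look like the
sketch below.  It is not a YAML parser and no replacement for yamllint; the
function name suspicious_items and the rule it applies (the first word of a
list item must end in ':') are assumptions chosen for this example, and the
rule would also flag lists of plain strings.

```python
import re

# A list item inside Registry is expected to look like "- Name: value".
ITEM = re.compile(r"^\s*-\s+(\S+)")


def suspicious_items(text):
    """Return (line_number, line) for list items whose first word lacks a trailing ':'."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = ITEM.match(line)
        if match and not match.group(1).endswith(":"):
            found.append((number, line))
    return found


# The broken artemis snippet from the diff above ("Name;" instead of "Name:").
broken = (
    "Registry:\n"
    " - Name: OMICtools\n"
    "   Entry: OMICS_00903\n"
    " - Name; bio.tools\n"
    "   Entry: artemis\n"
)
print(suspicious_items(broken))
```

The same check would also catch the "Name." typo found in clonalframe.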
 
> Please also have an eye on clonalframe
> (https://salsa.debian.org/med-team/clonalframe/tree/master/debian/upstream)
> which was not updated even though the other changes of mine seem to be all
> in.

Same here.  Importer says:

blends_prospective_gatherer.log:2018-03-27 08:34:50,809 - ERROR - (110): 
Scanner error in file 
/srv/udd.debian.org/mirrors/machine-readable/c/clonalframe.upstream of 
debian-med: mapping values are not allowed here


$ yamllint debian/upstream/metadata
debian/upstream/metadata
  1:1   warning  missing document start "---"  (document-start)
  21:10 error    syntax error: mapping values are not allowed here


$ git diff
diff --git a/debian/upstream/metadata b/debian/upstream/metadata
index 79c7b56..98566d5 100644
--- a/debian/upstream/metadata
+++ b/debian/upstream/metadata
@@ -17,5 +17,5 @@ Registry:
     Entry: SCR_016060
   - Name: bio.tools
     Entry: NA
-  - Name. OMICtools
+  - Name: OMICtools
     Entry: NA


So please use yamllint on all your updates - or, if not on all, at least on
those that look suspicious because they do not show up on the sentinel page.
It seems that just editing in Salsa online is a bit error-prone.

From time to time (every half year or so) I'm doing some QA on the UDD
log but in between I will not notice those things.  Here is the full list
of yaml issues in upstream files:

$ grep "ERROR.*\.upstream of" blends_prospective_gatherer.log 
2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of debian-med: mapping values are not allowed here
2018-03-27 08:33:15,614 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/o/octave-stk.upstream of pkg-octave: mapping values are not allowed here
2018-03-27 08:33:17,622 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/o/octave-divand.upstream of pkg-octave: mapping values are not allowed here
2018-03-27 08:34:00,876 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/r/rapmap.upstream of debian-med: mapping values are not allowed here
2018-03-27 08:34:50,809 - ERROR - (110): Scanner error in file /srv/udd.debian.org/mirrors/machine-readable/c/clonalframe.upstream of debian-med: mapping values are not allowed here
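
If somebody wants to do this kind of QA more regularly, the grep above
could be turned into a small script along the following lines.  This is
just a sketch: the regular expression is derived from the log lines shown
here, it assumes that every log record sits on a single line in the real
log file (the wrapping in this mail comes from the mail client), and the
function name yaml_errors is made up.

```python
import re

# Matches records such as:
# 2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file
#   /srv/.../a/artemis.upstream of debian-med: mapping values are not allowed here
PATTERN = re.compile(
    r"- ERROR - \(\d+\): .+? in file \S*/(?P<source>[^/\s]+)\.upstream "
    r"of (?P<team>[^:\s]+): (?P<message>.*)"
)


def yaml_errors(log_lines):
    """Return (source, team, message) for each upstream-metadata error record."""
    result = []
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            result.append(
                (match.group("source"), match.group("team"), match.group("message").strip())
            )
    return result


sample = [
    "2018-03-27 08:32:05,582 - ERROR - (110): Scanner error in file "
    "/srv/udd.debian.org/mirrors/machine-readable/a/artemis.upstream of debian-med: "
    "mapping values are not allowed here",
]
for source, team, message in yaml_errors(sample):
    print(source, team, message)
```

The resulting list of source packages and teams could then be mailed to
the respective team list.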


> I'll wait for tomorrow to indicate the others. Anyway, good to hear that no
> extra upload is required.

I can confirm that for this kind of data no upload has been required for
years. 

Kind regards

  Andreas.

-- 
http://fam-tille.de



Re: RRID update on salsa on packages starting with A+B

2018-03-28 Thread Steffen Möller

Hi Andreas,

I'll get to the other points of your fine reply a bit later. The easy 
ones first


On 3/27/18 9:04 AM, Andreas Tille wrote:

Anyway. I came across

  * one or two entries

Which ones?


I had thought this would be dead easy to answer and then I added quite a 
few more references since indeed there was no RRID assigned, yet. Hm. 
Did I sometimes close the salsa editing window prior to pushing the edit?


But I found artemis 
(https://salsa.debian.org/med-team/artemis/tree/master/debian/upstream) 
which was not touched for a while and looks syntactically just fine but 
has its RRIDs not shown.


Please also have an eye on clonalframe 
(https://salsa.debian.org/med-team/clonalframe/tree/master/debian/upstream) 
which was not updated even though the other changes of mine seem to be 
all in.


I'll wait for tomorrow to indicate the others. Anyway, good to hear that 
no extra upload is required.


Steffen




Re: RRID update on salsa on packages starting with A+B

2018-03-27 Thread Andreas Tille
Hi Steffen,

On Mon, Mar 26, 2018 at 07:23:24PM +0200, Steffen Möller wrote:
> 
> I just procrastinated a bit into using the comfort of salsa to update
> debian/upstream/metadata and here the references to SciCrunch, OMICtools and
> bio.tools registries. All three registries have improved their coverage
> enormously over the past few months. I am deeply impressed.

Thanks a lot for the large update.
 
> Anyway. I came across
> 
>  * one or two entries

Which ones?

> that had perfect RRID descriptions on salsa but not on
> our task page - does the package need to be re-uploaded for the change to
> become visible?

Re-uploading is *not* needed.  The data come from Salsa Git repositories
(about two weeks ago the machine-readable gatherer was switched from
Alioth to Salsa).  However, there is a delay of about 24 hours between
commits and visibility of the data on the web sentinel since at least
two cron jobs are involved (one that gathers the data and one that
creates the pages).

>  * belvu and blixem that are from the same source package but have different
> task entries and also separate catalog entries in all three registries. This
> breaks the current UDD schema. I have annotated it now as ['belvu','blixem']
> (for bio.tools, the others analogously).
> 
> Ideas for improvements anyone? Or is this how it should be for now?

I'm not sure.  In any case the current gatherer code will do nothing (at
best) or fail.  It seems that we are lucky and it does not break.  The
thing is that if we change our data model somebody (currently only me)
needs to adapt the code.  Currently there is no chance to resolve

 - Name: OMICtools
   Entry: ['OMICS_23183', 'OMICS_23184', 'OMICS_15828']

or

 - Name: SciCrunch
   Entry: ['SCR_015989','SCR_015994', 'NA']

How should the gatherer magically guess what binary package to choose?
The entry

 - Name: bio.tools
   Entry: ['belvu', 'blixem', 'dotter']

looks helpful - but it is just pure luck that bio.tools has chosen IDs
matching our package names.  So I think your data model is not helpful
since there is no chance to define a sequence of the binary packages
built from one source package.  Thus we somehow need to define the
binary package name explicitly.

For citations we are using the field Debian-package[1] which is for
instance used for meme package[2] (just to have another example since
in seqtools also the dotter publication is marked like this).  However,
this is because I once added an additional field "package" to the bibref
table which looks for instance like this:


udd=# select * from bibref where (source = 'meme' or source = 'seqtools') and key = 'title';
  source  |  key  |                                                  value                                                   | package | rank
----------+-------+----------------------------------------------------------------------------------------------------------+---------+------
 meme     | title | MEME: discovering and analyzing DNA and protein sequence motifs                                          |         |    0
 meme     | title | Discovering Sequence Motifs with Arbitrary Insertions and Deletions                                      | glam2   |    0
 seqtools | title | SeqTools: visual tools for manual analysis of sequence alignments                                        |         |    0
 seqtools | title | Scoredist: A simple and robust protein sequence distance estimator                                       |         |    1
 seqtools | title | A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis | dotter  |    0
 seqtools | title | A workbench for large-scale sequence homology analysis                                                   |         |    2


You see, packages with different names than the source packages got an
additional value in the package column since it was defined in our data
model first and implemented in the code afterwards.  However, the
registry table looks like this:


udd=# select * from registry where source = 'seqtools';
  source  |   name    |           entry
----------+-----------+---------------------------
 seqtools | OMICtools | {OMICS_23183,OMICS_23184}
 seqtools | bio.tools | {belvu,blixem}
 seqtools | SciCrunch | {SCR_015989,SCR_015994}


That's the status before your last commit since the machine-readable
gatherer cron job was not run yet.  The gatherer takes what it gets and
injects it into the database.  It's not magic - it's code that needs to be
adapted to a data model.  Changing the data model and hoping that
something sensible will happen does not work.
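
To make the point concrete, here is a rough sketch of what an explicit
binary package name could look like on the data side, by analogy with the
Debian-package field used for citations.  Both the "Debian-package" key
inside a Registry item and the function registry_rows are assumptions for
illustration only; neither the metadata files nor the gatherer support
this today.

```python
def registry_rows(source, registry):
    """Flatten Registry items into (package, name, entry) rows.

    Items without a 'Debian-package' key (a hypothetical field) fall back
    to the source package name.
    """
    rows = []
    for item in registry:
        package = item.get("Debian-package", source)
        rows.append((package, item["Name"], item["Entry"]))
    return rows


# One item per binary package instead of a list of IDs in a single Entry.
seqtools = [
    {"Name": "bio.tools", "Entry": "belvu", "Debian-package": "belvu"},
    {"Name": "bio.tools", "Entry": "blixem", "Debian-package": "blixem"},
    {"Name": "bio.tools", "Entry": "dotter", "Debian-package": "dotter"},
]
for row in registry_rows("seqtools", seqtools):
    print(row)
```

With one row per binary package nothing needs to be guessed from the
order of a list.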

What we should clarify in advance is:  Does the source column in the
registry table make sense at all or should it rather be a package column
referring to binary packages?  The web sentinel is working on binary
packages so maybe we should not keep source package names but rather
binary package names inside this table.  Alternatively w