On Mon, Dec 15, 2025 at 3:35 AM Lucas Nussbaum <[email protected]> wrote:

> I've been working on orig-check, a service that tries to reproduce the
> generation of upstream tarballs (e.g. .orig.tar.gz) from what is
> described in the debian/watch file.
>
> See https://orig-check.debian.net/


I looked at the statistics page, at https://orig-check.debian.net/statistics,
as it exists right now (about 2:50 PM US Eastern Standard Time, Dec 26,
2025, as I start to write this).  Comments and suggestions are below.
There's also a comment or two about the home page.

> At this point I consider the service to be in a reasonable state, and
> I'm mainly interested in feedback, requests for improvements, etc.
>

(1) Perhaps provide a "totals" line at the very bottom for each suite.
Presumably the percentage for the total for each suite would be 100%.

(1a) Oh!  I assume the "totals" values for each suite from (1) would always
match the actual current number of packages in that suite (at least modulo
the last time the service ran on the packages from that suite, e.g. the
suite added a package after the service last processed packages from that
suite).  If not, this is a bug, correct?  Should such a bug announce itself
on the summary and/or home page?

(2) Perhaps provide a "subtotals" line and/or a separate summary grouping
table (with totals, of course!) for each logical group of results.  At this
instant, it seems to me the logical groups are 900/901/910, 800/801, 700,
200 thru 290 (and perhaps a subsubtotal for all the 280 values?), and 120.
If it is a separate summary grouping table, I'm not sure whether it'd be
best to put it at the top or the bottom of the page.  I don't know if it
would make sense (and/or be doable) to make the logical groups clickable
into a detailed list, as the individual result classes are.

(2a) A separate summary groupings table as in (2) would also make sense for
the home page for this service.  (Perhaps instead of putting it on the main
summary page?)  Again, I don't know if it would make sense / be doable to
have the logical groups be clickable.

(2b) Perhaps it would make sense to have a separate summary of the
"successful" logical groups (900 et al, 800 et al), showing each as a
share of all successful groups (900 et al group is N% of all successful
groups) and/or all successful groups as a share of all groups (successful
groups are N% of all groups).  Presumably the ratio of the 800 et al group
to the 900 et al group would ideally shrink over time as fewer packages
require normalization (I assume this is deemed desirable)?  The ratio of
all successful groups to all groups would also be worth tracking over time.
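
To make (1), (2) and (2b) a bit more concrete, here is a rough Python
sketch of the totals, subtotals and percentages I have in mind.  It is only
an illustration: the grouping of result codes is my own guess from the
statistics page, the function and key names are made up, and I have no idea
how the service actually stores its counts.

```python
# Hypothetical grouping of result codes; my guess, not the service's own.
GROUPS = {
    "900 et al": (900, 901, 910),
    "800 et al": (800, 801),
    "700": (700,),
    "200 thru 290": tuple(range(200, 291)),
    "120": (120,),
}
# Groups I am treating as "successful" (assumption).
SUCCESSFUL = ("900 et al", "800 et al")


def pct(part, whole):
    """Percentage of part in whole, rounded to one decimal; 0.0 if whole is 0."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def summarize(counts):
    """Summarize one suite.

    counts maps a result code (int) to the number of packages with that
    result.  Returns the total, per-group subtotals, and success ratios.
    """
    total = sum(counts.values())
    subtotals = {
        name: sum(n for code, n in counts.items() if code in codes)
        for name, codes in GROUPS.items()
    }
    successful = sum(subtotals[name] for name in SUCCESSFUL)
    return {
        "total": total,
        "subtotals": subtotals,
        # Codes that fall in none of the groups above; nonzero would be
        # worth flagging, as in (1a).
        "ungrouped": total - sum(subtotals.values()),
        "successful": successful,
        "successful_pct_of_all": pct(successful, total),
        "share_of_successful": {
            name: pct(subtotals[name], successful) for name in SUCCESSFUL
        },
    }
```

Calling summarize() once per suite would give the "totals" line from (1),
and the "ungrouped" value is a cheap consistency check along the lines of
(1a).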

(3) Provide information, such as a timestamp, stating when the summary page
was generated; I assume the page is generated dynamically and/or
regenerated periodically as the service chugs away.  (This would also be
applicable to the home page, since at least some of it is
apparently dynamically generated.)

(3a) Provide information about when the data that the summary page
summarizes was last generated or processed.  (Also see (6) below.)

(4) Perhaps the statistics should be archived periodically, and/or as they
are regenerated.  If they are generated on demand and/or frequently (say
every 10 minutes), then keep, say, hourly archives for a day or three,
daily for a week or two, weekly for a month or two, monthly for a year or
two, and yearly indefinitely.  (Obviously if they are not generated as
frequently, then archiving should not be as frequent as I propose either.)
This would allow for analysis of any trends that might show up in this
service over time.  Archives should be retrievable somehow.
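
The thinning schedule in (4) could be sketched roughly as below.  This is
just one way to do it, under my own assumptions: the tier lengths (3 days,
14 days, 60 days, 730 days) are arbitrary picks from the ranges I gave, the
function names are made up, and keeping the latest snapshot in each bucket
is a choice, not a requirement.

```python
from datetime import datetime, timedelta


def bucket_key(ts, now):
    """Return the retention bucket a snapshot taken at ts falls into.

    Newer snapshots get finer buckets: hourly, then daily, weekly,
    monthly, and finally yearly.  Tier lengths are assumptions.
    """
    age = now - ts
    if age <= timedelta(days=3):
        return ("hour", ts.year, ts.month, ts.day, ts.hour)
    if age <= timedelta(days=14):
        return ("day", ts.year, ts.month, ts.day)
    if age <= timedelta(days=60):
        iso_year, iso_week, _ = ts.isocalendar()
        return ("week", iso_year, iso_week)
    if age <= timedelta(days=730):
        return ("month", ts.year, ts.month)
    return ("year", ts.year)


def snapshots_to_keep(timestamps, now):
    """Pick one snapshot per bucket (the latest one); return them sorted."""
    kept = {}
    for ts in sorted(timestamps):
        kept[bucket_key(ts, now)] = ts  # later snapshots overwrite earlier
    return sorted(kept.values())
```

Anything not returned by snapshots_to_keep() would be a candidate for
deletion the next time the archive is pruned.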

(5) Provide a link at the top back to the home page for the service
(although I suppose the "Back to All Results" button on the bottom provides
that) and other boilerplate: information about the contact person (you);
the source, copyright and license of the results and data as applicable;
thanks to DebianNet (and to DSA, if this ever transitions to an Official
Service); etc.  Some of this would also be applicable on the home page
(some of it is already there, of course).

(6) On the home page, provide information about how often the orig-check
service runs and/or regenerates its data, and when it last did so.  Is it
run manually?  As a cron job (when and/or how often)?  When a .dsc file is
uploaded and/or changed?  If it runs when a .dsc file is uploaded / changed,
is there also a periodic instance of "dump everything and reboot, start
from scratch" processing?


That's everything I can think of off the top of my head.  Nothing really
novel here, just bits and pieces, "nice to haves", rounding out the bulk of
what you've already provided (which does look interesting and like it'll be
useful to people).

Hope this is of some use or interest.  Thanks for your time.  Be well.


Joseph
