On 10/12/23 at 12:10 +1100, Stuart Prescott wrote:
> Package: qa.debian.org
> Severity: normal
> X-Debbugs-Cc: stu...@debian.org
>
> The 'maintainer' and 'maintainer_email' columns of the upload_history table
> in UDD have truncated email addresses. Somewhere the 'maintainer' data
> is being truncated and then the maintainer_email is consequently broken.
>
> udd=> SELECT maintainer, maintainer_email FROM upload_history WHERE
> maintainer_email LIKE '%=' LIMIT 10;
>                            maintainer                           |               maintainer_email
> ----------------------------------------------------------------+----------------------------------------------
>  Maintainers of GStreamer packages <pkg-gstreamer-maintainers@= | pkg-gstreamer-maintainers@=
>  Maintainers of GStreamer packages <pkg-gstreamer-maintainers@= | pkg-gstreamer-maintainers@=
>  Zenoss Packaging Team <pkg-zenoss-t...@lists.alioth.debian.or= | pkg-zenoss-t...@lists.alioth.debian.or=
>  Debian GNOME Maintainers <pkg-gnome-maintainers@lists.alioth.= | pkg-gnome-maintainers@lists.alioth.=
>  Debian Perl Group <pkg-perl-maintainers@lists.alioth.debian.o= | pkg-perl-maintainers@lists.alioth.debian.o=
>  Debian VoIP Team <pkg-voip-maintain...@lists.alioth.debian.or= | pkg-voip-maintain...@lists.alioth.debian.or=
>  Debian Python Modules Team <python-modules-team@lists.alioth.= | python-modules-team@lists.alioth.=
>  Debian Python Modules Team <python-modules-team@lists.alioth.= | python-modules-team@lists.alioth.=
>  Debian Firebird Group <pkg-firebird-gene...@lists.alioth.debi= | pkg-firebird-gene...@lists.alioth.debi=
>  Debian Samba Maintainers <pkg-samba-maint@lists.alioth.debian= | pkg-samba-maint@lists.alioth.debian=
> (10 rows)
>
> The input data from the d-d-c mailing list looks fine in the web archive,
> but I can imagine this being due to linewrapping in the mbox files.
>
> Looking at one specific example:
>
> https://lists.debian.org/debian-devel-changes/2007/12/msg00466.html
>
> udd=> SELECT maintainer, maintainer_email FROM upload_history WHERE
> maintainer_email LIKE '%=' AND source = 'libxml-rss-perl' AND version =
> '1.31-3';
>                            maintainer                           |              maintainer_email
> ----------------------------------------------------------------+---------------------------------------------
>  Debian Perl Group <pkg-perl-maintainers@lists.alioth.debian.o= | pkg-perl-maintainers@lists.alioth.debian.o=
> (1 row)
>
> This particular example is quite old but the problem also exists in
> recent uploads; as of writing the most recent one is libgetdata (0.11.0-9)
> that was uploaded today.
>
> Of the 850k rows in upload_history, this data issue is in 70k of them.

Hi,

I made some changes to the email decoding that solved most cases. We are
down to 1162 badly processed emails (from the 70k you reported):

udd=> SELECT count(*) FROM upload_history WHERE maintainer_email LIKE '%=';
 count
-------
  1162

All of them date from 2022-08-27 or later, which coincides with dak adding
a detached signature. So there might still be something to fix in the code
for that case.

udd=> select source, version, date from upload_history where
maintainer_email LIKE '%=' order by date asc limit 10;
           source           |    version    |          date
----------------------------+---------------+------------------------
 libsweble-common-java      | 3.0.8-3       | 2022-08-27 20:49:34+00
 xeus                       | 2.4.0-2       | 2022-08-27 20:49:43+00
 systemd                    | 251.4-3       | 2022-08-27 22:05:51+00
 cross-toolchain-base-ports | 53            | 2022-08-28 10:04:10+00
 opencascade                | 7.6.3+dfsg1-3 | 2022-08-28 10:36:28+00
 wvkbd                      | 0.10-1        | 2022-08-28 10:36:40+00
 gobject-introspection      | 1.73.0+ds-1   | 2022-08-28 10:49:10+00
 yade                       | 2022.01a-11   | 2022-08-28 11:05:40+00
 ruby-em-http-request       | 1.1.7-1       | 2022-08-28 12:29:29+00
 ruby-rails-i18n            | 7.0.5-1       | 2022-08-28 14:51:31+00
(10 rows)

Lucas
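
For reference, a trailing '=' is what a quoted-printable soft line break
looks like when the body is read line by line without undoing the transfer
encoding first. The sketch below illustrates that effect on a made-up
message; it is not the UDD importer code. The sample mail, the
lists.example.org address and the helper names decoded_text and
maintainer_field are all hypothetical. Walking the MIME parts is an
assumption about what the detached-signature mails might need, not
something verified against real dak output.

```python
import email
from email.utils import parseaddr

# Hypothetical sample, not a real d-d-c mail: a quoted-printable body whose
# long Maintainer line was wrapped with a soft line break ("=" + newline).
RAW = (
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Source: example\n"
    "Maintainer: Debian Example Group <pkg-example-maintainers@lists.example.o=\n"
    "rg>\n"
)


def decoded_text(msg):
    """Join all text/plain parts after undoing their transfer encoding."""
    chunks = []
    for part in msg.walk():  # yields msg itself for a single-part mail
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)  # bytes, CTE already undone
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        chunks.append(payload.decode(charset, errors="replace"))
    return "".join(chunks)


def maintainer_field(text):
    """Return the value of the first 'Maintainer: ' line, or None."""
    prefix = "Maintainer: "
    for line in text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


msg = email.message_from_string(RAW)

# Reading the raw payload keeps the soft line break, so the value ends in "=".
naive = maintainer_field(msg.get_payload())

# Decoding first rejoins the wrapped line before the field is extracted.
fixed = maintainer_field(decoded_text(msg))

print(naive)
print(parseaddr(fixed)[1])
```

On this sample, the naive value ends in '=' just like the rows matched by
LIKE '%=', while the decoded value yields the complete address.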