Package: www.debian.org
User: www.debian....@packages.debian.org
Usertag: scripts
Severity: normal

Hello all,

(This is a long bug report; I suspect several different issues converge here. Sorry for the long mail.)

For some weeks now, from time to time, we have been receiving a "new" kind of "Validation error", e.g. this one from 7 Dec 2017:

  *** Errors validating /srv/www.debian.org/www/devel/wnpp/being_packaged_byactivity.en.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional

This happens when our "validate" script tries to analyze an HTML file that is actually empty (zero size).

I don't know:

* why the build script (or the validation tool) sometimes produces these zero-size files (let's call this ISSUE_A). I couldn't reproduce the issue locally.
* whether this was happening more often before, but the validate program in jessie did not complain while the one in stretch (more modern, yeah!) does, which would be why we only notice the issue now.
* whether somebody acted on the files reported on 7 Dec 2017 (I didn't). All the zero-size HTML files were under /devel/wnpp, so I guess the files were automatically rebuilt not long after the issue, and that time the build was correct. The files are displayed correctly on the website and we didn't get any more validation errors about them.

On 11 Dec 2017 I again received similar "Validation error" mails.
In particular, about these 6 files:

  *** Errors validating /srv/www.debian.org/www/News/weekly/2013/19/index.it.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/weekly/2004/42/index.pt.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/weekly/2005/index.pt.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/2003/20030728.sv.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/press/2001.sv.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/weekly/2000/22/index.sv.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional

That time I was able to have a look at the build server, and saw that all those files had zero size and were created on the evening of 9 Dec 2017 (during the website rebuild for a point release).

I looked for zero-size files on the website build machine:

  larjona@wolkenstein:/srv/www.debian.org/www$ sudo -u debwww find . -size 0

and found the files that generated the validation errors, plus some other files (.err, .log and other temp files).

I removed the zero-size HTML files with

  sudo -u debwww rm /srv/www.debian.org/News/path-to/file.XX.html

and expected that the next website build would regenerate the files and everybody would be happy. We didn't receive any more "Validation errors", so I thought the problem was solved (well, ISSUE_A, why these zero-size files are generated, still stands, but we can investigate further if/when the problem reappears).

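The search above can be narrowed so that it lists only empty HTML files, together with their modification time. This is only a minimal sketch, not what was actually run on the build machine: the function name `list_empty_html` is made up for illustration, and restricting the search to `*.html` is an assumption added here to skip the .err, .log and temp files.

```shell
# Hypothetical helper: list zero-size HTML files below a directory,
# with their size and modification time (via "ls -l").
list_empty_html() {
    # $1: root of the tree to scan, e.g. /srv/www.debian.org/www
    # -type f        : regular files only
    # -name '*.html' : skip .err, .log and other temp files (assumption)
    # -size 0        : files whose size is zero
    find "$1" -type f -name '*.html' -size 0 -exec ls -l {} +
}

# Possible usage on the build machine, as the same user as in the report:
#   sudo -u debwww find /srv/www.debian.org/www -type f -name '*.html' -size 0 -exec ls -l {} +
```

Printing the date next to each file makes it easy to see whether a file is a leftover from an earlier build (e.g. 9 Dec 2017) or was freshly regenerated.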
Yesterday, 18 Dec 2017, we again received "validation errors" related to zero-size HTML files:

  *** Errors validating /srv/www.debian.org/www/News/weekly/2013/19/index.it.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/2003/20030728.sv.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/press/2001.sv.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/weekly/2000/22/index.sv.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/weekly/2004/42/index.pt.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional
  *** Errors validating /srv/www.debian.org/www/News/weekly/2005/index.pt.html: ***
  Line 1, character 1: missing document type declaration; assuming HTML 4.01 Transitional

Again, the same files as last week! And I've checked that the date of those files is again 9 Dec 2017 (!)

So I guess one of these things happened:

* My "rm" command is not working and the files are not actually deleted (maybe I'm deleting them in the wrong folder/machine?). If this is the case, I'd like to know how I should proceed to actually remove the files (let's call it ISSUE_B), and why we didn't receive validation error mails every day since 11 Dec 2017 (let's call it ISSUE_C).
* My "rm" command worked, but some process put the files back in their folders after my removal (and maybe that process ran just yesterday, hence the lack of validation error mails until yesterday?).

I'm not sure what I can do to try to solve the issues.

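One way to rule out the first possibility would be to check explicitly that the file exists before the removal and is gone afterwards. The following is only a sketch: `remove_and_verify` is a made-up name, and on the build machine it would have to run as the debwww user, like the commands above.

```shell
# Hypothetical helper: delete one file and confirm that it is really gone.
remove_and_verify() {
    # $1: full path of the file to delete
    if [ ! -e "$1" ]; then
        # Nothing at that path: probably the wrong folder or machine.
        echo "not found: $1" >&2
        return 1
    fi
    rm -- "$1" || return 1
    if [ -e "$1" ]; then
        echo "still present after rm: $1" >&2
        return 1
    fi
    echo "removed: $1"
}

# Possible usage with one of the reported files:
#   remove_and_verify /srv/www.debian.org/www/News/press/2001.sv.html
```

If the function reports "removed" and the file shows up again later with the old date, the second possibility (some process putting the files back) becomes the more likely one.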
A *workaround* that comes to mind is to make a dummy commit to the 6 files, so that they are rebuilt (truly rebuilt) and we get rid of the validation mails for now.

Another thing we can do is to remove only one or two of the HTML files and see what happens. But we only keep logs of the last two website builds, so I'll try to do it when I can be sure that I have time to look at the logs of the following build.

Meanwhile, I leave this bug open in case it rings a bell for somebody.

Best regards
--
Laura Arjona Reina
https://wiki.debian.org/LauraArjona