ArielGlenn added a comment.
In T207030#4703693, @Smalyshev wrote:
Ah, ok, didn't see your comment - yes, we probably need to reduce or cancel small file check for lexemes, or eliminate empty shards. I am not sure how easy it is to do the latter - I am on vacation this week so I'd start with the fo
ArielGlenn added a comment.
Job has not started yet so the change should have made it out in time for this week's run.
TASK DETAIL
https://phabricator.wikimedia.org/T207030
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: ArielGlenn
Cc: gerritbot, Aklapper, Smalysh
gerritbot added a comment.
Change 470447 merged by ArielGlenn:
[operations/puppet@production] Reduce small file size for lexemes
https://gerrit.wikimedia.org/r/470447
gerritbot added a comment.
Change 470447 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[operations/puppet@production] Reduce small file size for lexemes
https://gerrit.wikimedia.org/r/470447
Smalyshev added a comment.
Ah, ok, didn't see your comment - yes, we probably need to reduce or cancel small file check for lexemes, or eliminate empty shards. I am not sure how easy it is to do the latter - I am on vacation this week so I'd start with the former and go back to the latter after I'm
Smalyshev added a comment.
Small batches are normal - these are empty or semi-empty shards, I guess. I wonder, though, why the job does not proceed to create the full dump. Maybe the small file check is not correct?
ArielGlenn added a comment.
root@snapshot1008:~# more /var/log/wikidatadump/dumpwikidatattl-wikidata-20181028-lexemes-BETA-main.log
File size of is only 518223. Aborting.
The file size cutoff is 2000/8 = 250. So that's why no files wind up in the output directory.
Also, the error message
ArielGlenn added a comment.
Nope. Something else is wrong. I see no cronspam, no lexeme job running now on the snapshot host, this week's json job has started, but the 'latest' file is Oct 14. There is a 20181028 directory but it is empty.
There are a bunch of temp files left in /mnt/dumpsdata/xmld
ArielGlenn added a comment.
This is now deployed on snapshot1008 (where cron jobs run). We'll know next Monday if this took care of the problem; let's leave the task open til then.
gerritbot added a comment.
Change 467415 merged by ArielGlenn:
[operations/puppet@production] Fix lexeme error msgs
https://gerrit.wikimedia.org/r/467415
Smalyshev added a comment.
I think I found the bug. From what it looks like it shouldn't have influenced the dump.
gerritbot added a comment.
Change 467415 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[operations/puppet@production] Fix lexeme error msgs
https://gerrit.wikimedia.org/r/467415