Thank you for helping me with the log files! That is really helpful. There 
are quite a few duplicate titles: some are not actually duplicates but just have 
the same title. We need to come up with a new and better strategy for naming 
files! And indeed, some were uploaded before (for instance by me :-)).
We also encountered some character set issues. I cannot explain where they came 
from, but I will research it a bit more.

This has been a huge learning experience for me! So thank you for helping me.
Next upload will hopefully go a lot smoother (we also have some rough edges to 
deal with on our side :-))

Best
Lizzy


-----Original message-----
From: Glamtools [mailto:glamtools-boun...@lists.wikimedia.org] On behalf of bawolff
Sent: Sunday 27 September 2015 16:47
To: Conversations revolving around the development of GLAM Digital Tools 
<glamtools@lists.wikimedia.org>
Subject: Re: [Glamtools] upload from Rijksmuseum has stalled?

On Sun, Sep 27, 2015 at 3:27 AM, Lizzy Jongma <l.jon...@rijksmuseum.nl> wrote:
> Dear all,
>
> I restarted my upload on September 22nd: I started uploading approx. 2600 
> images of works of art depicting birds. Until the 25th of September the images 
> were ingested into Wikimedia Commons (they were slowly dripping in), but over 
> the last two days no new images have been ingested. Does anyone know what went 
> wrong, why the upload stopped (again), and how I can restart my job?
>
> I also have difficulties estimating how many images and which images were 
> ingested: which log can I check? I can’t find the relevant log to see what 
> was uploaded or what problems were reported etc.
>
> Thank you very much for your help.
>
> best wishes
> Lizzy Jongma


The logs for you in particular are at
https://commons.wikimedia.org/w/index.php?title=Special%3ALog&type=gwtoolset&user=LizzyJongma&page=&year=&month=-1&tagfilter=&uselang=en

If you want the list in XML or JSON format, you can use 
https://commons.wikimedia.org/w/api.php?action=query&list=logevents&letype=gwtoolset&leuser=LizzyJongma&lelimit=max&format=jsonfm
(see the API docs for details). In particular, leaction can be used to filter by 
success/failure, and you need to use the lecontinue parameter to get the next page 
of results.
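
As a rough illustration of the paging described above, here is a minimal Python sketch that walks through all pages of the log. It assumes format=json instead of jsonfm (jsonfm is the human-readable HTML view), and it assumes the API returns a "continue" object holding the lecontinue value whenever more results are available. The function names and the User-Agent string are made up for this example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"


def build_params(user, extra=None):
    # Same parameters as the URL above, but with format=json for scripts.
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "gwtoolset",
        "leuser": user,
        "lelimit": "max",
        "format": "json",
    }
    if extra:
        params.update(extra)
    return params


def fetch_json(params):
    # Performs one HTTP GET against the API and decodes the JSON reply.
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url,
        headers={"User-Agent": "gwtoolset-log-check/0.1 (example script)"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_log_events(user, fetch=fetch_json):
    # Yields every log entry, following the continuation values
    # (including lecontinue) until the API stops returning them.
    continuation = {}
    while True:
        data = fetch(build_params(user, continuation))
        yield from data.get("query", {}).get("logevents", [])
        if "continue" not in data:
            break
        continuation = data["continue"]


# Usage (needs network access):
# for event in iter_log_events("LizzyJongma"):
#     print(event.get("timestamp"), event.get("action"), event.get("title"))
```

The fetch argument is only there so the paging loop can be tried out with canned responses before pointing it at the live API.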

There are several entries that appear to be "skipped" because the image already 
exists on Commons. The record numbers are in the 2800s (the record numbers 
will skip around a bit, but should very roughly go from low to high in order), 
so if you only have about 2600 images, I would guess that the tool went 
through all the images.

It seems that the XML file may have the same filename for several 
images. For example, https://commons.wikimedia.org/wiki/File:Inhoudsopgave.jpeg 
is marked as being replaced by a second image after your initial upload, and in 
the log it looks like record 2815 was going to replace that image a third 
time, except someone had edited that page in the meantime (
https://commons.wikimedia.org/w/index.php?title=File:Inhoudsopgave.jpeg&diff=173165602&oldid=173147674
), so GWToolset failed instead of silently overwriting.
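
One way to check for this before the next upload is to count the titles in the metadata XML. The sketch below is only an illustration: I have not seen the actual file, so the element names "record" and "title" and the sample data are assumptions that would need to be adapted to the real structure.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_titles(xml_text, record_tag="record", title_tag="title"):
    # Returns {title: count} for every title used by more than one record.
    # record_tag and title_tag are hypothetical element names.
    root = ET.fromstring(xml_text)
    titles = [
        (record.findtext(title_tag) or "").strip()
        for record in root.iter(record_tag)
    ]
    counts = Counter(titles)
    return {title: n for title, n in counts.items() if n > 1}


# Made-up sample data, only to show the shape of the result.
SAMPLE_XML = """
<records>
  <record><title>Inhoudsopgave</title></record>
  <record><title>Inhoudsopgave</title></record>
  <record><title>Titelpagina</title></record>
</records>
"""

print(duplicate_titles(SAMPLE_XML))  # {'Inhoudsopgave': 2}
```

Any title that shows up in the result would end up as the same file name on Commons unless something unique, such as an object number, is added to it.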

I also notice that several of the templates and file names have question marks in 
them. If the ? is unintentional, that might indicate an issue with charset 
conversion.
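
To show how a charset conversion can produce question marks, here is a small Python sketch. It is a guess at one possible mechanism, not a diagnosis of what happened in this upload: it simulates a lossy conversion to ASCII and flags titles that contain "?" or the Unicode replacement character. The helper names and the sample title are made up.

```python
def ascii_roundtrip(text):
    # Simulates a lossy conversion: every character outside ASCII
    # is replaced by "?".
    return text.encode("ascii", errors="replace").decode("ascii")


def suspicious_titles(titles):
    # Flags titles containing "?" or U+FFFD (the Unicode replacement
    # character), which often point to a failed conversion.
    return [t for t in titles if "?" in t or "\ufffd" in t]


sample = "Reiger op \u00e9\u00e9n poot"  # hypothetical title with accents
print(ascii_roundtrip(sample))  # Reiger op ??n poot
print(suspicious_titles([sample, ascii_roundtrip(sample)]))
```

Running suspicious_titles over the titles in the source file before uploading would show whether the question marks are already present there or only appear later in the pipeline.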

GWToolset is still quite rough around the edges, especially when it comes to 
showing progress and explaining what happened in case of errors.

--
-Brian

_______________________________________________
Glamtools mailing list
Glamtools@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/glamtools