In a surprising turn of events, the latest survey
(https://www.surveymonkey.com/s/F6CGPDJ) shows that people consistently
prefer the chained thumbnails (each thumbnail generated from the next
bigger thumbnail) both to the ones we currently generate from the original
and to thumbnails all generated from the largest thumbnail. This holds both
in terms of sharpness (not surprising, since chaining adds more sharpening
passes) and in terms of overall quality. I suspect the quality preference
is also due to the sharpening being more pronounced visually.
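
To make the three strategies compared in the survey concrete, here is a
minimal sketch of which source each thumbnail would be rendered from. The
function name, the strategy labels and the widths are made up for
illustration; they are not taken from MediaWiki or from the survey setup.

```python
def plan_sources(widths, strategy):
    """Map each thumbnail width to the source it would be rendered from.

    "original" stands for the uploaded file; an integer is the width of
    another thumbnail used as input. The strategy names are hypothetical
    labels for the three approaches compared in the survey.
    """
    if strategy not in ("original", "largest", "chained"):
        raise ValueError(f"unknown strategy: {strategy}")

    ordered = sorted(widths, reverse=True)
    plan = {}
    for i, width in enumerate(ordered):
        if strategy == "original" or i == 0:
            # Status quo, or the biggest thumbnail in any strategy:
            # render straight from the uploaded file.
            plan[width] = "original"
        elif strategy == "largest":
            # Every smaller thumbnail comes from the single biggest one.
            plan[width] = ordered[0]
        else:
            # Chained: each thumbnail comes from the next bigger one.
            plan[width] = ordered[i - 1]
    return plan
```

With widths 1024, 512 and 256, the "chained" plan renders 1024 from the
original, 512 from 1024, and 256 from 512, so the smallest thumbnail has
gone through three scaling (and sharpening) passes.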

JHeald on Commons village pump also brought up the fact that the resizing
we currently do with ImageMagick's -thumbnail introduces artifacts on some
images, which I've verified:
* using -thumbnail
https://dl.dropboxusercontent.com/u/109867/imagickchaining/sharpening/435-sharpened.jpg
* using -resize
https://dl.dropboxusercontent.com/u/109867/imagickchaining/sharpening/435-sharpened-resize.jpg
(I found out about that after the survey, for which all images were
generated using the status quo -thumbnail option.)

I'm pretty sure that we are using -thumbnail because it advertises itself
as being faster for large images. However, in the testing I've done on
large images, it seems that if we chained thumbnail generation, the
performance gains would be so large that we could afford to use -resize and
avoid those artifacts, while still generating thumbnails much faster than
we currently do.
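
As a rough illustration of chaining with -resize, the sketch below only
builds the ImageMagick command lines, it does not run them. The file
naming scheme, the function name and the example sharpening value are
assumptions for the example, not what production runs.

```python
def chained_commands(original, widths, scaler="-resize", sharpen=None):
    """Build one `convert` argument list per width, largest first.

    Each command reads the output of the previous one, so only the first
    command has to decode the (potentially huge) original file.
    `sharpen` is an optional ImageMagick -sharpen argument such as
    "0x0.4" (an illustrative value, not a recommendation).
    """
    if scaler not in ("-resize", "-thumbnail"):
        raise ValueError(f"unsupported scaler: {scaler}")

    commands = []
    source = original
    for width in sorted(widths, reverse=True):
        target = f"thumb-{width}.jpg"  # hypothetical naming scheme
        # "<width>x" asks ImageMagick to scale to that width and keep
        # the aspect ratio.
        cmd = ["convert", source, scaler, f"{width}x"]
        if sharpen:
            # Sharpening is applied after scaling; the order matters.
            cmd += ["-sharpen", sharpen]
        cmd.append(target)
        commands.append(cmd)
        source = target  # the next, smaller thumbnail reads this one
    return commands
```

Each list could be passed to subprocess.run() as is. Switching back to the
status quo scaler for comparison is a matter of passing
scaler="-thumbnail".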

In conclusion, it seems safe to implement chaining, where we maintain a set
of reference thumbnails, each generated from the next bigger one. Image
quality isn't impacted negatively by doing that, according to the survey.
And we would be able to use -resize, which would save us from artifacts and
improve image quality. Unless anyone objects, the Multimedia team can start
working on that change. I consider the idea of generating those reference
thumbnails at upload time, before the file is considered uploaded, to be a
separate task, which we're also exploring at the moment.


On Fri, May 9, 2014 at 10:59 AM, Gilles Dubuc <gil...@wikimedia.org> wrote:

> After taking a closer look at what commands run exactly in production, it
> turns out that I probably applied the IM parameters in the wrong order when
> I put together the survey (order matters, particularly for sharpening).
> I'll regenerate the images and make another (hopefully better) survey that
> will compare the status quo, chained thumbnails and single thumbnail
> reference.
>
>
> On Mon, May 5, 2014 at 11:04 AM, Gilles Dubuc <gil...@wikimedia.org> wrote:
>
>> Buttons is French: Suiv. -> Make it English
>>>
>>
>> That's a bug in SurveyMonkey: the buttons are in French because I was
>> using the French version of the site at the time the survey was created,
>> and now the text on those buttons can't be changed. I'll make sure to
>> switch SurveyMonkey to English before creating the next one.
>>
>> No "swap" or "overlay" function for being able to compare
>>>
>>
>> SurveyMonkey is quite limited and can't do that, unfortunately. The
>> alternative would be to build my own survey from scratch, but that would
>> require a lot of resources for little benefit. This is really a one-off
>> need.
>>
>>
>>> I wonder if the mip-mapping approach could somehow be combined with tiles?
>>> If we want proper zooming for large images, we will have to split them up
>>> into tiles of various sizes, and serve only the tiles for the visible
>>> portion when the user zooms on a small section of the image. Splitting up
>>> an image is a fast operation, so maybe it could be done on the fly (with
>>> caching for a small subset based on traffic), in which case having a chain
>>> of scaled versions of the image would take care of the zooming use case as
>>> well.
>>
>>
>> Yes, we could definitely have the reference thumbnail sizes be split up on
>> the fly to generate tiles, when we get around to implementing proper
>> zooming. It's as simple as making Varnish cache the tiles and having the
>> PHP backend generate them on the fly by splitting the reference thumbnails.
>>
>> Regarding the survey I ran on wikitech-l, so far there are 26
>> respondents. It seems that on the images with a lot of edges (the test
>> images provided by Rob) at least 30% of people can tell the difference in
>> terms of quality/sharpness. On regular images people can't really tell.
>> Thus, I wouldn't venture to do the full chaining, as a third of visitors
>> will be able to tell that there's a quality degradation. I'll run another
>> survey later in the week where instead of full chaining all the thumbs are
>> generated based on the biggest thumb.
>>
>>
>>
>>
>> On Sat, May 3, 2014 at 1:25 AM, Gergo Tisza <gti...@wikimedia.org> wrote:
>>
>>> On Thu, May 1, 2014 at 7:02 AM, Gilles Dubuc <gil...@wikimedia.org> wrote:
>>>
>>> > Another point about picking the "one true bucket list": currently Media
>>> > Viewer's buckets have been picked based on the most common screen
>>> > resolutions, because Media Viewer tries to always use the entire width of
>>> > the screen to display the image, so trying to achieve a 1-to-1 pixel
>>> > correspondence makes sense, because it should give the sharpest result
>>> > possible to the average user.
>>> >
>>>
>>> I'm not sure the current size list is particularly useful for MediaViewer,
>>> since we are fitting images into the screen, and the huge majority of
>>> images are constrained by height, so the width of the image on the screen
>>> will be completely unrelated to the width bucket size. Having common screen
>>> sizes as width buckets would be useful if we would be filling instead of
>>> fitting (something that might make sense for paged media).
>>>
>>> ------
>>>
>>> I wonder if the mip-mapping approach could somehow be combined with tiles?
>>> If we want proper zooming for large images, we will have to split them up
>>> into tiles of various sizes, and serve only the tiles for the visible
>>> portion when the user zooms on a small section of the image. Splitting up
>>> an image is a fast operation, so maybe it could be done on the fly (with
>>> caching for a small subset based on traffic), in which case having a chain
>>> of scaled versions of the image would take care of the zooming use case as
>>> well.
>>> _______________________________________________
>>> Wikitech-l mailing list
>>> Wikitech-l@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>>
>>
>>
>
