Re: [Commons-l] Commons search function vs. Google

2011-10-14 Thread Gerard Meijssen
Hoi,
When you want to know about a subject like this, what do you learn at
Britannica? Also, when I search for ejaculation on my Wikipedia, there is no
word spelled like that.

This has become such a silly subject. It makes no difference to state that
there is controversial content. There is controversy in so many ways and it
is clear that the reasons for controversy are not universal.

If anything, I am pretty pleased that this whole issue is no skin off my nose.
It is a no-win situation, or an all-lose situation. The people who are
against controversial content will be unhappy with what they will consider
half-baked. The people who allow for controversial content will be unhappy
because they will be frustrated by what they consider an obtrusion.

Commenting ad nauseam that there is controversial content is just as bad. It does
not help.

PS Yes I am frustrated.
Thanks,
 GerardM

On 14 October 2011 14:27, Paul Houle  wrote:

> On 10/12/2011 7:07 PM, Andreas Kolbe wrote:
>
>  Maarten,
>
>  The problem to solve is that people who are looking for an image of a
> cucumber or a children's toy
> may not appreciate being presented with an image where the item in question
> is used for masturbation.
>
>
> It's a general issue that content on Wikimedia Foundation sites is
> often not safe for a wide range of ages. For instance, consider the video
> on this page:
>
> http://en.wikipedia.org/wiki/Ejaculation
>
> This isn't something you'd see in the Encyclopedia Britannica.
> People often think of Wikipedia as a family-friendly place that's good for
> education, and that's true for about 99.8 percent of it, but somewhere
> around the 1-in-1,000 level you find stuff that would have trouble with
> some people's idea of "community standards."
>
>
> ___
> Commons-l mailing list
> Commons-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/commons-l
>
>
___
Commons-l mailing list
Commons-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/commons-l


Re: [Commons-l] Commons search function vs. Google

2011-10-14 Thread Paul Houle

 On 10/12/2011 7:07 PM, Andreas Kolbe wrote:

> Maarten,
>
> The problem to solve is that people who are looking for an image of a
> cucumber or a children's toy may not appreciate being presented with an
> image where the item in question is used for masturbation.



It's a general issue that content on Wikimedia Foundation sites
is often not safe for a wide range of ages. For instance, consider the
video on this page:


http://en.wikipedia.org/wiki/Ejaculation

This isn't something you'd see in the Encyclopedia Britannica.
People often think of Wikipedia as a family-friendly place that's good
for education, and that's true for about 99.8 percent of it, but
somewhere around the 1-in-1,000 level you find stuff that would have
trouble with some people's idea of "community standards."




Re: [Commons-l] Commons search function vs. Google

2011-10-13 Thread Federico Leva (Nemo)
Federico Leva (Nemo), 11/10/2011 21:26:
> Looks like there are 248 exact file matches.
> 
>
> I see that the first image doesn't use the information template; perhaps
> descriptions within templates are treated differently? Could be a wrong
> assumption based on how infoboxes work on Wikipedia. (Just more
> imaginative speculations...)

I've added the template and it's now 5th. The links still seem to be the
same, so it looks like that was the problem?

Nemo



Re: [Commons-l] Commons search function vs. Google

2011-10-12 Thread Andreas Kolbe
Maarten,

The problem to solve is that people who are looking for an image of a cucumber
or a children's toy may not appreciate being presented with an image where the
item in question is used for masturbation.

I asked Brandon about the search algorithm; he told me he had just answered the 
same question here:

http://www.quora.com/Why-is-the-second-image-returned-on-Wikimedia-Commons-when-one-searches-for-electric-toothbrush-an-image-of-a-female-masturbating


There are some comments from Pete Forsyth at that link as well; he noted that
the same search results also appear for multimedia searches in the Wikipedias
(e.g. http://www.webcitation.org/62OEEbIub ).

Cheers,
Andreas



>
>From: Maarten Dammers
>To: commons-l@lists.wikimedia.org
>Sent: Wednesday, 12 October 2011, 20:38
>Subject: Re: [Commons-l] Commons search function vs. Google
>
>Hi Andreas,
>
>On 11-10-2011 23:36, Andreas Kolbe wrote:
>>Maarten,
>>
>>That sounds like the most plausible answer to me to date. We know that sexual
>>images are among the most popular in Commons.
>>
>>This is something the personal image filter would (in part) address. We could
>>also have a look at our search algorithm.
>
>That sounds like a solution to a problem, but you didn't actually state the
>problem. What's the problem you're trying to solve?
>
>Maarten


Re: [Commons-l] Commons search function vs. Google

2011-10-12 Thread Maarten Dammers

Hi Andreas,

On 11-10-2011 23:36, Andreas Kolbe wrote:

> Maarten,
>
> That sounds like the most plausible answer to me to date. We know that
> sexual images are among the most popular in Commons.
>
> This is something the personal image filter would (in part) address.
> We could also have a look at our search algorithm.

That sounds like a solution to a problem, but you didn't actually state
the problem. What's the problem you're trying to solve?


Maarten





Re: [Commons-l] Commons search function vs. Google

2011-10-11 Thread Andreas Kolbe
Maarten,

That sounds like the most plausible answer to me to date. We know that sexual 
images are among the most popular in Commons.

Some similar searches:


Underwater:

http://commons.wikimedia.org/w/index.php?title=Special%3ASearch&search=underwater&fulltext=Search


(The bondage image is not among the first 50 in Google with safe search off).

Jumping ball:

http://commons.wikimedia.org/w/index.php?title=Special%3ASearch&search=Jumping+ball&fulltext=Search


(That image is first in Google as well, even with strict safe search enabled.)

This is something the personal image filter would (in part) address. We could 
also have a look at our search algorithm.

Andreas







From: Maarten Dammers 
To: commons-l@lists.wikimedia.org
Sent: Tuesday, 11 October 2011, 21:04
Subject: Re: [Commons-l] Commons search function vs. Google


Hi Andreas,

On 11-10-2011 17:22, Andreas Kolbe wrote:


>
>Why is our listing so different from the one in Google, and why are sexual 
>images so much higher up in our listing of search results?
>
My assumption is that the popularity (either incoming links or number of 
clicks) might be taken into account. See http://stats.grok.se/commons.m/top to 
see what people like to click on on Commons and cross reference that with the 
images that show up high in the search results.

Maarten



Re: [Commons-l] Commons search function vs. Google

2011-10-11 Thread Maarten Dammers

Hi Andreas,

On 11-10-2011 17:22, Andreas Kolbe wrote:

> Why is our listing so different from the one in Google, and why are
> sexual images so much higher up in our listing of search results?

My assumption is that the popularity (either incoming links or number of
clicks) might be taken into account. See 
http://stats.grok.se/commons.m/top to see what people like to click on 
on Commons and cross reference that with the images that show up high in 
the search results.
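
The cross-referencing suggested above could be sketched roughly as follows. This is only an illustration: the function name, the file titles, and the cut-off `k` are made up, and the two lists would have to be copied by hand from the stats page and from Special:Search.

```python
def popular_in_results(top_viewed, search_results, k=10):
    """Return the titles among the first k search results that also
    appear in the list of most-viewed pages, in search-result order."""
    popular = set(top_viewed)
    return [title for title in search_results[:k] if title in popular]


# Hypothetical sample data, not real Commons titles.
top_viewed = ["File:A.jpg", "File:B.jpg", "File:C.jpg"]
search_results = ["File:X.jpg", "File:B.jpg", "File:A.jpg", "File:Y.jpg"]

hits = popular_in_results(top_viewed, search_results, k=3)
print(hits)
```

A large overlap near the top of the results would be consistent with popularity playing a role, though it would not prove it.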


Maarten


Re: [Commons-l] Commons search function vs. Google

2011-10-11 Thread Federico Leva (Nemo)
Andrew Gray, 11/10/2011 20:11:
> It may be that more controversial images provoke more meta-discussion,
> with more links to them as a result (from talkpages, deletion
> discussions, etc) and so are more likely to appear "popular" to the
> search system, but that's just a guess.

Hm, Lucene Streisand effect.

Béria Lima, 11/10/2011 20:31:
> I guess that has something to do with the name of the images. The sexual
> image has the name File:Sexuality *pearl necklace* small.png
> so it would obviously be one of the first results if you are looking
> for *pearl necklace*.

Looks like there are 248 exact file matches. 

I see that the first image doesn't use the information template; perhaps
descriptions within templates are treated differently? That could be a wrong
assumption based on how infoboxes work on Wikipedia. (Just more
imaginative speculations...)

Nemo



Re: [Commons-l] Commons search function vs. Google

2011-10-11 Thread Béria Lima
I guess that has something to do with the name of the images. The sexual
image has the name File:Sexuality *pearl necklace* small.png, so it
would obviously be one of the first results if you are looking for
*pearl necklace*.
_
*Béria Lima*
(351) 925 171 484

*Imagine a world in which every single person is given free access to the
sum of all human knowledge. That is what we are doing.*


On 11 October 2011 16:53, WereSpielChequers wrote:

>
> --
>>
>> Message: 5
>> Date: Tue, 11 Oct 2011 16:22:37 +0100 (BST)
>> From: Andreas Kolbe 
>> Subject: [Commons-l] Commons search function vs. Google
>> To: Wikimedia Commons Discussion List 
>> Message-ID:
>><1318346557.48784.yahoomail...@web29620.mail.ird.yahoo.com>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> We are wondering on Meta[1] what criteria the Commons search function uses
>> to establish the order of search results displayed.
>>
>>
>> To give some examples, searching for "pearl necklace" in Commons shows a
>> woman with sperm on her neck as the first image result:
>>
>>
>> http://commons.wikimedia.org/w/index.php?title=Special%3ASearch&search=pearl+necklace&fulltext=Search
>>
>>
>> The same image is way down in a Google search (with safe search off) for
>> pearl necklace on Commons:
>>
>>
>> http://www.google.co.uk/search?q=cucumber+site:commons.wikimedia.org&um=1&hl=en&sa=N&tbm=isch&bav=on.2,or.r_gc.r_pw.,cf.osb&biw=&bih=774&uss=1#um=1&hl=en&safe=off&tbm=isch&sa=1&q=pearl+necklace+site:commons.wikimedia.org&oq=pearl+necklace+site:commons.wikimedia.org&aq=f&aqi=&aql=&gs_sm=e&gs_upl=113279l114967l0l115854l14l11l0l0l0l8l261l2003l0.8.3l11l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=49f703222a617ec&biw=&bih=774
>>
>>
>> Searching for "electric toothbrushes" in Commons shows a woman
>> masturbating with a toothbrush as the second image result:
>>
>>
>>
>> http://commons.wikimedia.org/w/index.php?title=Special%3ASearch&search=electric+toothbrushes&fulltext=Search
>>
>>
>> The same image turns up in Google as well (with safe search switched off),
>> though not as one of the first results:
>>
>>
>> http://www.google.co.uk/search?q=cucumber+site:commons.wikimedia.org&um=1&hl=en&sa=N&tbm=isch&bav=on.2,or.r_gc.r_pw.,cf.osb&biw=&bih=774&uss=1#um=1&hl=en&safe=off&tbm=isch&sa=1&q=electric+toothbrushes+site:commons.wikimedia.org&pbx=1&oq=electric+toothbrushes+site:commons.wikimedia.org&aq=f&aqi=&aql=&gs_sm=e&gs_upl=341351l344565l0l345961l21l19l0l0l0l13l255l3528l0.11.8l19l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=49f703222a617ec&biw=&bih=774
>>
>>
>> Searching for "cucumber" in Commons shows a woman with a cucumber up her
>> vagina on the first page of search results:
>>
>>
>> http://commons.wikimedia.org/w/index.php?title=Special%3ASearch&search=cucumber&fulltext=Search
>>
>> Doing a Google search for cucumber on Commons (with safe search off) does
>> not bring this image up among the first hundred or so results:
>>
>>
>> http://www.google.co.uk/search?q=cucumber+site:commons.wikimedia.org&um=1&hl=en&sa=N&tbm=isch&bav=on.2,or.r_gc.r_pw.,cf.osb&biw=&bih=774&uss=1
>>
>>
>> Why is our listing so different from the one in Google, and why are sexual
>> images so much higher up in our listing of search results?
>>
>>
>> Andreas
>>
>>
>> [1] http://meta.wikimedia.org/wiki/Controversial_content/Brainstorming
>>
>>
> I don't know how Google does it, but I'd bet that our search prioritises by
> word order in the description. So a description that starts Pearl Necklace
> comes before "A white pearl necklace". If you amend the description then I
> suspect the search results will change.
>
> WereSpielChequers
>
>


Re: [Commons-l] Commons search function vs. Google

2011-10-11 Thread Andrew Gray
On 11 October 2011 16:53, WereSpielChequers  wrote:

> I don't know how Google does it, but I'd bet that our search prioritises by
> word order in the description. So a description that starts Pearl Necklace
> comes before "A white pearl necklace". If you amend the description then I
> suspect the search results will change.

There are some notes on the internals of Lucene-search here:

http://www.mediawiki.org/wiki/User:Rainman/search_internals

"Article content" presumably is the same as the image description in
our context. I don't know quite what the "rank" metric would mean in
the Commons context - presumably, only links from local pages on
Commons count?

It may be that more controversial images provoke more meta-discussion,
with more links to them as a result (from talkpages, deletion
discussions, etc) and so are more likely to appear "popular" to the
search system, but that's just a guess.
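
To make that guess concrete, here is a toy sketch of how a link-based boost could push a heavily discussed image above a better text match. The formula, the file names, and the numbers are all invented for illustration; this is not the actual Lucene-search ranking.

```python
import math


def combined_score(text_score, incoming_links):
    """Hypothetical ranking formula: text relevance multiplied by a
    logarithmic boost for the number of incoming links."""
    return text_score * (1.0 + math.log1p(incoming_links))


# (title, text relevance, incoming links); all values are made up.
candidates = [
    ("plain_photo.jpg", 0.9, 0),
    ("much_discussed.jpg", 0.6, 40),
]

ranked = sorted(candidates,
                key=lambda c: combined_score(c[1], c[2]),
                reverse=True)
for title, text_score, links in ranked:
    print(title, round(combined_score(text_score, links), 3))
```

Under these made-up numbers the image with 40 incoming links ranks first despite the weaker text match, which is the kind of effect described above.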

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk



Re: [Commons-l] Commons search function vs. Google

2011-10-11 Thread WereSpielChequers
> --
>
> Message: 5
> Date: Tue, 11 Oct 2011 16:22:37 +0100 (BST)
> From: Andreas Kolbe 
> Subject: [Commons-l] Commons search function vs. Google
> To: Wikimedia Commons Discussion List 
> Message-ID:
><1318346557.48784.yahoomail...@web29620.mail.ird.yahoo.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> We are wondering on Meta[1] what criteria the Commons search function uses
> to establish the order of search results displayed.
>
> To give some examples, searching for "pearl necklace" in Commons shows a
> woman with sperm on her neck as the first image result:
>
>
> http://commons.wikimedia.org/w/index.php?title=Special%3ASearch&search=pearl+necklace&fulltext=Search
>
>
> The same image is way down in a Google search (with safe search off) for
> pearl necklace on Commons:
>
>
> http://www.google.co.uk/search?q=cucumber+site:commons.wikimedia.org&um=1&hl=en&sa=N&tbm=isch&bav=on.2,or.r_gc.r_pw.,cf.osb&biw=&bih=774&uss=1#um=1&hl=en&safe=off&tbm=isch&sa=1&q=pearl+necklace+site:commons.wikimedia.org&oq=pearl+necklace+site:commons.wikimedia.org&aq=f&aqi=&aql=&gs_sm=e&gs_upl=113279l114967l0l115854l14l11l0l0l0l8l261l2003l0.8.3l11l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=49f703222a617ec&biw=&bih=774
>
>
> Searching for "electric toothbrushes" in Commons shows a woman masturbating
> with a toothbrush as the second image result:
>
>
>
> http://commons.wikimedia.org/w/index.php?title=Special%3ASearch&search=electric+toothbrushes&fulltext=Search
>
>
> The same image turns up in Google as well (with safe search switched off),
> though not as one of the first results:
>
>
> http://www.google.co.uk/search?q=cucumber+site:commons.wikimedia.org&um=1&hl=en&sa=N&tbm=isch&bav=on.2,or.r_gc.r_pw.,cf.osb&biw=&bih=774&uss=1#um=1&hl=en&safe=off&tbm=isch&sa=1&q=electric+toothbrushes+site:commons.wikimedia.org&pbx=1&oq=electric+toothbrushes+site:commons.wikimedia.org&aq=f&aqi=&aql=&gs_sm=e&gs_upl=341351l344565l0l345961l21l19l0l0l0l13l255l3528l0.11.8l19l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=49f703222a617ec&biw=&bih=774
>
>
> Searching for "cucumber" in Commons shows a woman with a cucumber up her
> vagina on the first page of search results:
>
>
> http://commons.wikimedia.org/w/index.php?title=Special%3ASearch&search=cucumber&fulltext=Search
>
> Doing a Google search for cucumber on Commons (with safe search off) does
> not bring this image up among the first hundred or so results:
>
>
> http://www.google.co.uk/search?q=cucumber+site:commons.wikimedia.org&um=1&hl=en&sa=N&tbm=isch&bav=on.2,or.r_gc.r_pw.,cf.osb&biw=&bih=774&uss=1
>
>
> Why is our listing so different from the one in Google, and why are sexual
> images so much higher up in our listing of search results?
>
>
> Andreas
>
>
> [1] http://meta.wikimedia.org/wiki/Controversial_content/Brainstorming
>
>
I don't know how Google does it, but I'd bet that our search prioritises by
word order in the description. So a description that starts Pearl Necklace
comes before "A white pearl necklace". If you amend the description then I
suspect the search results will change.
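
As a rough illustration of that bet, here is a toy scoring function in which a description that starts with the search phrase outranks one where the phrase appears later. The function and the sample descriptions are hypothetical; this is a sketch of the word-order idea only, not of how the Commons search actually works.

```python
def position_score(description, query):
    """Toy score: 1.0 if the description starts with the query phrase,
    lower the later the phrase appears, 0.0 if it does not appear."""
    words = description.lower().split()
    terms = query.lower().split()
    n = len(terms)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == terms:
            return 1.0 / (1 + i)
    return 0.0


# Hypothetical descriptions, ranked for the query "pearl necklace".
descriptions = ["A white pearl necklace", "Pearl necklace", "Gold ring"]
ranked = sorted(descriptions,
                key=lambda d: position_score(d, "pearl necklace"),
                reverse=True)
print(ranked)
```

If the real search behaved like this, editing a description so that the phrase no longer comes first would move the file down the list.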

WereSpielChequers