Re: [Wikitech-l] Captcha for non-English speakers II

Platonides Mon, 30 Jul 2012 15:23:44 -0700

On 30/07/12 15:28, Pau Giner wrote:
> From the UX perspective, a captcha is always an obstacle for the
> interaction flow.
I agree.
But when you're spammed to death if there's no captcha, you end up
accepting it as a necessary evil.
But don't let this pessimistic view stop you from proposing new
alternatives.



> Reducing the complexity of user interaction when solving the captcha can
> benefit all kinds of users but also solve problems for non-English speakers.
> 
> Checkbox and honeypot-based captchas avoid most of the problems of
> text-based captchas since interaction is simplified to the minimum for the
> user:
> http://uxmovement.com/forms/captchas-vs-spambots-why-the-checkbox-captcha-wins/

No. Those work against generic spambots. For a small site, pretty much
any custom-made captcha will work.
When someone designs against your captcha, you need to provide a hard test.
If we were comparing against a math captcha, checkbox is more usable
while only slightly weaker. Neither of them stands a chance against a bot
designed against it.

If you run Wikipedia, bad guys will work to defeat your captcha and
spam/vandalise/annoy you.
If you are developing MediaWiki, a wiki used in thousands of sites [1],
spammers will work to make bots capable of spamming those many MediaWiki
installs (cf. DantMan's reply).
If you are Open Source, then a captcha is much harder to make (you can't
rely on security by obscurity, neither of the code nor of the challenges
themselves...).


1- http://www.google.com/search?q=%22powered%20by%20mediawiki%22
~201.000.000 results



> Simple questions where the user can select an answer (not type) will solve
> some of the input-related issues for non-English speakers.
> These questions can be of different kinds (e.g., "Which one does not belong
> to the group: Red, Green, Skateboard, Blue?", "Is fire hot or cold?") and
> they can be based on text or image selection.
> An example of image-based captcha is available at
> http://www.picatcha.com/captcha/

No.
Those are *harder* since you need knowledge of the English language and
its terms.

I can fill in a text captcha on a foreign-language site since its very
appearance (after being trained by hundreds of sites!) shows what is
expected of me.
If I go to http://www.picatcha.com/captcha/, I am asked to "Select ALL
the images of «concept»". Which is fine but requires me to know what
that «concept» is. I might, eg., think that hourglasses are a kind of
spectacles (eyeglasses) and get very annoyed by not being able to pass it.

Also, making good questions is tricky. You need to produce loads of
questions of that kind, with their answers. If you made just a few hundred
(eg. because it's done by a human), I could make a list of the questions
with their answers (manually solved) and spam you as many times as I want.
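
The list attack above can be sketched in a few lines of Python. This is only an illustration: the `SOLVED` table, `normalise` and `answer` are hypothetical names, and the two entries are the example questions quoted earlier in this thread, not a real question pool.

```python
# Hypothetical sketch: a small, human-written question pool can be solved
# once by hand and then replayed as often as the attacker likes.

def normalise(question: str) -> str:
    """Collapse case and whitespace so cosmetic differences don't matter."""
    return " ".join(question.lower().split())

# Answers the attacker solved manually (examples taken from this thread).
SOLVED = {
    normalise("Is fire hot or cold?"): "hot",
    normalise("Which one does not belong to the group: "
              "Red, Green, Skateboard, Blue?"): "skateboard",
}

def answer(question: str):
    """Return the memorised answer, or None if the question is still unknown."""
    return SOLVED.get(normalise(question))
```

An unknown question only costs the attacker one more manual solve, so the table converges on the whole pool after a few hundred entries.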

You want to make intelligent questions that are hard for bots, but anyone
should be able to solve them, even if they are young, uneducated or foreign.
I may know that I have to rule colors out, but I don't know which of
skateboard vs turquoise is the color.
And yet, you can't dumb it down so much that a computer will be able to
answer it.

Suppose you are asking questions of the type "Is X Y or Z?" and have
made thousands of pairs (that you can't share!).
A naive approach would be just to answer Y or Z at random, accepting a 50%
failure rate (bots don't mind resending their requests many times, so a
captcha that blocks only 50% of attempts is broken). But we can do better:
when you ask my bot "Is fire hot or cold?" it could go and search Google
for those concepts:
* fire hot 1.210.000.000 results
* fire cold 656.000.000 results

There's a very clear correlation of fire with hot rather than with cold,
thus it chooses 'hot', and defeats your captcha. :)
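
A minimal sketch of both bots, in Python. The names are hypothetical, and `quoted_hit_count` is a stand-in that only knows the two result counts quoted above; a real bot would query a search engine instead, which is not shown here.

```python
from typing import Callable, Sequence

def pick_by_cooccurrence(subject: str,
                         options: Sequence[str],
                         hit_count: Callable[[str], int]) -> str:
    """Choose the option whose '<subject> <option>' query has the most hits."""
    return max(options, key=lambda opt: hit_count(f"{subject} {opt}"))

# Stand-in for a search lookup: only the counts quoted in this mail.
QUOTED_COUNTS = {"fire hot": 1_210_000_000, "fire cold": 656_000_000}

def quoted_hit_count(query: str) -> int:
    return QUOTED_COUNTS.get(query, 0)

def pass_probability(per_try: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - per_try) ** attempts

print(pick_by_cooccurrence("fire", ["hot", "cold"], quoted_hit_count))  # hot
print(pass_probability(0.5, 10))  # 0.9990234375
```

The second figure is why random guessing is already enough: with a 50% chance per try, ten retries get through more than 99.9% of the time.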



> Tagging media can be also used as a captcha. Google has been experimenting
> with asking users to tag videos as a captcha:
> http://cups.cs.cmu.edu/soups/2009/proceedings/a14-kleuver.pdf  [PDF]

If we were doing this with Wikimedia Commons videos
a) The video set is known, as are the descriptions. Ergo, match the
video with its file and description.
b) IMHO having to watch a video (even if short) is *more* annoying than
typing a text captcha.*
c) No/poor localisation.


* This needs to be balanced with how much you want to enter the
captcha-walled garden, of course. I may accept watching your CEO
boasting about your service (about which you then ask me the captcha**)
in exchange for a Gmail-like mail account or multigigabyte Dropbox
storage, but not watching one every time I sign in!

** Don't complain if he's tagged by most users as 'boring'. :)


> In any case, some experimentation would be required to determine any of the
> above approaches (or combination of several) provides an appropriate
> security-usability balance for the specific needs of the Wikipedia.

We would first need an evaluation of what is considered spam, and how to
measure it. If we get lots of bots the day after you enable it, it's clearly
broken, but how much time would we need before being x% confident that
it is secure enough, when you are just waiting for some random guy to decide
to code against your challenge?


_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
