On 1 May 2014 05:18, Joseph Bonneau <[email protected]> wrote:
> [Starting a new thread on this]
>
> Sorry for being a week late, but I read over Tom's proposal at:
> https://github.com/tomrittervg/crypto-usability-study
>
> Having spent much of the last two weeks reviewing papers for SOUPS
> (Symposium on Usable Privacy and Security) and discussing design flaws
> in many security usability studies, there are a few points I'm
> concerned about:
>
> * Assigning study participants to multiple treatments (e.g. having
> them test multiple methods of fingerprint comparison) introduces a
> number of issues. Your data points are now correlated, so you can't
> just average performance for each treatment and compare. The
> statistics get vastly more complicated to do correctly, and your
> statistical power goes down. More importantly, there are real external
> validity questions here: users will learn and be more clever and alert
> after seeing multiple systems, whereas most users will only ever see
> one.
>
> * While not explicitly stated, I'm imagining the same participant will
> be asked to perform multiple trials for each treatment. This also
> introduces some complexity into the statistical analysis, but is more
> manageable.
>
> * Also not really stated is the "base rate" of errors. If users are
> doing the task multiple times, we want the base rate to be extremely
> low to have reasonable validity. In reality, fewer than 1% of the
> fingerprints you ever compare are going to mismatch. People do an
> approximation of Bayesian reasoning, and if their prior probability is
> 0.99 that the fingerprints match, that is very different from an
> experiment where fingerprints mismatch half the time and participants
> come to expect it. There are two ways around this: a deception study
> (the "head fake" approach, as Tom put it), or having users do the task
> many times with the vast majority of fingerprints matching.
>
> * This proposal includes 10 experimental treatments. That's a lot. If
> we don't re-use participants, we already need 10 people just to have
> one person try each experiment, and that's assuming the phone method
> doesn't take two people. If we also ask them all to do many dummy
> trials with matching fingerprints to screen for errors, this is an
> utterly impractical study.
>
> Overall I think it will be nigh-impossible to do this study in person
> and get a sufficient sample size. I propose doing the study online
> using Amazon Mechanical Turk (mTurk). This is now standard for
> psychology experiments in general and for security usability
> experiments as well. While not perfect, multiple studies have
> confirmed that this yields a much more representative user population
> than in-person studies ever obtain.
>
> I would run the experiment as follows:
>
> * For the phone comparison method, play an audio recording of somebody
> reading the fingerprint and display it on screen for comparison. This
> isn't perfect, as it's non-interactive, but it's a start.
>
> * For the business card method, show a JPEG of a business card and
> have them compare it to a version rendered in text.
>
> * Assign each user to only one treatment and recruit many more total
> users. Because you pay users for their time, this is basically
> cost-neutral.
>
> * Give each user 50-100 trials, and have perhaps 1-5 of them be
> incorrect. The positions should be randomized, since users
> occasionally talk out of band about studies. Use a 50-50 mix of
> random and targeted errors.
>
> With 10 treatments, we can aim for 100 users per treatment.
> We may want to adjust this based on some pilot studies. If the
> experiment takes 10 minutes, that's probably about $1 per user, so 10
> treatments x 100 users x $1 comes to roughly $1,000. That's a lot,
> but an in-person user study would cost far more.
I like this plan a lot. I am curious whether it's a good idea to give
people so many trials - presumably for most people, verification is a
relatively infrequent event, and letting participants become familiar
with it over a long session seems like it would bias the results. A
few rough sketches below of how some of these pieces might look; the
record formats and numbers in them are my own assumptions, not
anything Tom or Joe has specified.
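One cheap way to check for that familiarity effect directly in the
data: compare miss rates on mismatching trials that appear early in a
session against ones that appear late. If people sharpen up over
50-100 trials, it should show here. A rough sketch; the
(trial_index, session_length, caught) record format is just something
I'm assuming for illustration:

    def learning_effect(mismatch_trials):
        """mismatch_trials: iterable of (trial_index, session_length,
        caught) records, one per mismatching trial across all users.
        Splits each session in half and compares miss rates, as a
        crude test for practice effects."""
        early = [not c for i, n, c in mismatch_trials if i < n / 2.0]
        late = [not c for i, n, c in mismatch_trials if i >= n / 2.0]
        rate = lambda xs: sum(xs) / float(len(xs)) if xs else float("nan")
        return rate(early), rate(late)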
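Generating each user's trial list per Joe's parameters (50-100 trials,
1-5 mismatches at random positions, a 50-50 mix of random and targeted
errors) could look like the sketch below. "Targeted" here is
approximated by corrupting just a couple of hex digits of the true
fingerprint - a stand-in for however real attack fingerprints would
actually be produced:

    import random

    HEX = "0123456789abcdef"
    FP_LEN = 40  # hex digits, e.g. an OpenPGP v4 fingerprint

    def random_fingerprint():
        return "".join(random.choice(HEX) for _ in range(FP_LEN))

    def targeted_mismatch(fp, flips=2):
        """Change a few hex digits and leave the rest identical -- a
        crude stand-in for an attacker-chosen near-match."""
        chars = list(fp)
        for i in random.sample(range(FP_LEN), flips):
            chars[i] = random.choice(HEX.replace(chars[i], ""))
        return "".join(chars)

    def make_schedule(n_trials=75, n_bad=3):
        """One user's session: a list of (shown, reference,
        should_match) tuples, with mismatches at random positions."""
        schedule = [(fp, fp, True) for fp in
                    (random_fingerprint() for _ in range(n_trials))]
        for j, i in enumerate(random.sample(range(n_trials), n_bad)):
            fp = schedule[i][0]
            # alternate random and targeted errors, per Joe's 50-50 mix
            wrong = random_fingerprint() if j % 2 else targeted_mismatch(fp)
            schedule[i] = (fp, wrong, False)
        return schedule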
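The base-rate concern can also be made concrete with a short Bayes
calculation. The hit and false-alarm rates below are invented purely
for illustration; the point is how much the meaning of a user
"accepting" a fingerprint changes between a lab where half the pairs
mismatch and the real world where almost none do:

    def p_attack_given_accept(base_mismatch_rate,
                              p_catch=0.95,          # assumed hit rate
                              p_false_reject=0.01):  # assumed false alarms
        """P(pair actually mismatched | user accepted it), by Bayes."""
        p_accept_bad = (1 - p_catch) * base_mismatch_rate
        p_accept_good = (1 - p_false_reject) * (1 - base_mismatch_rate)
        return p_accept_bad / (p_accept_bad + p_accept_good)

    for rate in (0.5, 0.05, 0.01):
        print("base rate %.2f -> P(mismatch | accepted) = %.4f"
              % (rate, p_attack_given_accept(rate)))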
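Finally, on Joe's point about correlated observations: short of
fitting a proper mixed-effects model, one conservative analysis is to
collapse each user to a single number - their miss rate on the
mismatching trials - and compare treatments on those per-user rates,
so each user contributes one independent observation. A sketch, again
assuming a made-up (user_id, treatment, caught) record format, with
scipy's mannwhitneyu doing the comparison:

    from collections import defaultdict
    from scipy.stats import mannwhitneyu

    def per_user_miss_rates(results, treatment):
        """results: iterable of (user_id, treatment, caught) records,
        one per mismatching trial."""
        counts = defaultdict(lambda: [0, 0])  # user -> [missed, total]
        for user, t, caught in results:
            if t == treatment:
                counts[user][0] += 0 if caught else 1
                counts[user][1] += 1
        return [missed / float(total) for missed, total in counts.values()]

    def compare_treatments(results, a, b):
        # One value per user per sample, so observations are independent.
        return mannwhitneyu(per_user_miss_rates(results, a),
                            per_user_miss_rates(results, b),
                            alternative="two-sided")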
