On 10 March 2014 09:36, Daniel Kahn Gillmor <[email protected]> wrote:
> 0) "reading aloud" comes from a video to keep a constant rate
>
> this does not match my experience in how fingerprints are compared when
> read aloud. in practice, even if there are multiple listeners, there is
> often pushback from the listeners, asking for a pair of octets to be
> repeated, or for the reader to slow down. a video played at constant
> speed (without the user able to control it) doesn't feel at all like the
> same interaction, let alone representative of the same cognitive
> challenge. I'm not sure how to solve this. maybe we just focus on the
> business card approach?

My goal with the video was to introduce consistency in the speed of the
reader. Allowing or not allowing them to repeat could go either way. I
think I leaned away from it because I suspect that allowing them to
repeat (and telling them so) will introduce a bias towards 'Success':
people now have a mechanism by which to assure success, namely making
the reader repeat a lot. (It's like taking the time gate away from the
time-gated approach.) But, like I said, we can change it.

I do have to say that in the times I've done read-aloud key
fingerprints... I'm slow. And I'm usually too embarrassed to ask them to
slow down or repeat. And I know that exploiting a single slip on a hex
character (e.g. one getting left out) is still an extremely
computationally difficult task, so I just go with it ;)

> 1) error rate for computationally-chosen flaw
>
> the spec currently suggests we'd run the ssh fingerprint look-alike
> tool. assuming this tool works, it seems clear that this would bias the
> results against the hex fingerprint, and toward the pseudoword or
> english word outcome, since the look-alike tool currently embeds some
> knowledge about sensitive cognitive comparisons in the hexadecimal
> output space (e.g. that it is a "better match" to flip two bits in the
> first or second nybble of an octet than to flip one bit in each nybble).
> maybe we could craft a look-alike tool for each of the different
> mechanisms, encoding whatever domain-specific knowledge we have?

Yes, absolutely. That wasn't clear in the doc, but I did intend to
create a look-alike tool for each fingerprint type.
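To make that concrete, here's roughly the kind of per-format scorer I'm
imagining (a sketch only -- the real tool would search over actual
candidate keys, and the names, weights, and partial-credit values here
are made up, not decided):

    def hex_lookalike_score(target, candidate):
        """Rough 'confusability' score between two hex fingerprints.

        Per dkg's point: damage confined to a single nybble shows up as
        one changed hex digit, while one flipped bit in each nybble
        changes two digits, so we compare at the digit level rather
        than the bit level.
        """
        t, c = target.lower(), candidate.lower()
        assert len(t) == len(c)
        return sum(a == b for a, b in zip(t, c)) / len(t)

    def pseudoword_lookalike_score(target, candidate):
        """Analogous scorer for the pseudoword encoding: identical words
        count fully, words off by a single letter count partially.
        (Assumed weights, not measured ones.)"""
        tw, cw = target.split(), candidate.split()
        score = 0.0
        for a, b in zip(tw, cw):
            if a == b:
                score += 1.0
            elif len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1:
                score += 0.5
        return score / max(len(tw), 1)

    def best_lookalike(target, candidates, score=hex_lookalike_score):
        """Pick the candidate a human is most likely to confuse with the
        target; that candidate becomes the 'computationally-chosen flaw'
        shown to subjects."""
        return max(candidates, key=lambda c: score(target, c))

The English-word format would get its own scorer in the same spirit,
encoding whatever domain-specific knowledge we have for it.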
> 2) prevalence
>
> are we planning on showing the users three fingerprints, one which
> matches exactly, one with a subtle flaw, and one with the
> computationally chosen flaw? This would result in a mismatch prevalence
> of 67% -- far, far higher than the prevalence of actual fingerprint
> mismatches in most daily use. a known baseline prevalence rate seems
> likely to affect the way most users think about these matches.

Hm... so you're saying that because users see 2 out of 3 mismatches in
the first couple of tests, that will skew them towards 'Success' later
because they expect more mismatches? Actually, thinking about it, having
an obvious pattern of "2 out of the 3 for any type are incorrect" is
also likely to skew things quite a bit. I suppose we need to throw in
more matching ones and not have a distinguishable pattern. (This will
balloon the number of tests per subject, though, which I was initially
trying to minimize.)
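Something like this is what I'm picturing for the per-subject trial mix
(the 25% mismatch rate, the 8 trials per format, and the names are
placeholders, not proposals):

    import random

    FORMATS = ["hex", "pseudoword", "english"]

    def build_trials(trials_per_format=8, mismatch_rate=0.25, seed=None):
        """Build one subject's trial list with a fixed mismatch
        prevalence but no per-format ordering pattern to learn."""
        rng = random.Random(seed)
        trials = []
        for fmt in FORMATS:
            n_bad = round(trials_per_format * mismatch_rate)
            kinds = (["match"] * (trials_per_format - n_bad)
                     + [rng.choice(["subtle_flaw", "computed_flaw"])
                        for _ in range(n_bad)])
            trials.extend((fmt, kind) for kind in kinds)
        rng.shuffle(trials)
        return trials

Even at those numbers that's 24 comparisons per subject across the three
formats, which is exactly the ballooning I was worried about.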
> I'm trying to think about what the user experience of the study would be
> -- do they see the fingerprints to match in rapid succession? are they
> embedded in some other task? (i think the other task would be the "head
> fake" you describe; i'm not sure what that task would be, though).

Exactly. And I'm not sure what the head fake is either. Maybe it's
something like "Do task A, then talk with Bob, which requires this
match, and Bob's going to tell you to do tasks B and C," where tasks A,
B, and C appear to be the complicated things we're interested in
measuring.

-tom
_______________________________________________
Messaging mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/messaging