On Thu, 5 May 2005, Josh White wrote:
I don't think you'll get much participation unless you tell the user the computer's best guess AFTER they click the answer. The point is the user can see how their answer improved the computer.
With this feedback there is a slight risk that a person would start to think like Cyc rather than like a naive and natural human.
True... But I think that's the price we'll have to pay for getting free participants.
Fer shur dude!
It would be cleaner to pay participants a token amount of money to participate - I think that will greatly improve the quality because people feel like they're working, not like they're entertaining themselves.
Interesting. Well, I'm open to the idea of paying subjects but have limited funds. I suggest we keep pushing forward, try to get some mock ups, see what kinds of results we can get for free and then start paying as we need to, once the design is production level/good science.
www.hotornot.com has the same problem. Try it out with a friend standing over your shoulder - you'll see what I mean. You can feel yourself start to sway your own ratings, or even lie outright. I've seen three different people using it, and they all did not tell the complete truth. Still, overall, the scores are pretty accurate. You do find extremely ugly people getting a higher score than you'd expect - that's about the only bias that shows up, that I've noticed.
Interesting.
distracting... (The real problem will be keeping people's motivation and excitement up.)
Well said. For that reason, I suggest:
- add a personality to the AI - name, sex, etc. I think HAL is perfect, but whatever.
HAL sounds excellent to me. I wonder if there is some sort of legal issue associated with using that name? I doubt it's trademarked. So let's just keep going and ask the next attorney friend we happen to bump into. Then again, if/when we get research Cyc in the loop, Cycorp may require us to change the name from HAL to Cyc. Whatever.
- keep the answer and the next question on the same page, same as hotornot
Yep.
- put the scientific explanations one link deep, from that page
Great idea.
Example page:
---------------------------
You said HAL was wrong to say "Vienna is wet."

HAL 1000 thinks Vienna is wet because
(1) "Rivers are a kind of water."
(2) "If water touches x then x is wet."
(3) "The Danube is a river."
(4) "The Danube runs through Vienna."
(5) "If a river runs through a region it touches that region."

(Click [here] for the science behind this project.)
5 other users agreed with you, and 1 other user did not. So far, you humans are convincing HAL he is wrong.
Hal also believes Paris is in France. Is that right?
[true] [false]
Excellent suggestions Josh. I think this will look really cool and be fun!
And I had a five-point rating scale: highly unbelievable, unbelievable, neutral, believable, highly believable. But I think we should keep it simple, so yes, it should be true/false for now. We can experiment with finer gradations later.
Note that many of the things they will be rating will be simple "gafs," i.e. ground atomic formulae, such as "The Danube is a river." These things do not have a complicated justification. So, instead of...
Example page:
---------------------------
You said HAL was wrong to say "Vienna is wet."

HAL 1000 thinks Vienna is wet because... blah blah blah
...it will be ...
Example page: --------------------------- You said HAL was right to say "The Danube is a river."
HAL 1000 thinks the Danube is a river because it was directly added to HAL's brain by one of HAL's human brain builders.
...Also, note that many (half?) of the items they will be rating will be reversed, e.g. "It is not the case that the Danube is a river." or, if NL generation is working very well, "The Danube is not a river." (but don't count on that better English lingo; keep your expectations low)... In this case, we will have to change the wording a little. Something like this...
Example page:
---------------------------
You said HAL was wrong to say "The Danube is not a river."

Actually, HAL 1000 thinks "the Danube is a river", but for technical reasons [click here for an explanation of the science behind this project] having to do with how our experiment is conducted, we have purposely mangled some of the things you are rating.
Anyway, HAL 1000 thinks the Danube is a river because it was directly added to HAL's brain by one of HAL's human brain builders.
...oy vey!!! That is a lot of text... We can boil this down. But what are your thoughts, Joshua? Can you get some mock-ups of this working?
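For the mock-ups, the wording logic in the example pages above could be sketched roughly like this. This is just a toy sketch to show the branching (reversed vs. unreversed, gaf vs. deduction); the function name and fields are my assumptions, not an agreed design:

```python
# Toy sketch: choose the feedback-page wording for one rated item.
# Assumes each item records its English text, whether we reversed it,
# and whether it is a ground fact ("gaf") or a deduction.

def feedback(item_text, user_said_right, reversed_item, is_gaf,
             justification=None):
    """Return the lines of the feedback page for one rating."""
    lines = ['You said HAL was %s to say "%s"'
             % ("right" if user_said_right else "wrong", item_text)]
    if reversed_item:
        lines.append('Actually, HAL 1000 believes the opposite; for '
                     'technical reasons we purposely mangled some of '
                     'the things you are rating.')
    if is_gaf:
        lines.append("HAL 1000 believes this because it was directly "
                     "added to HAL's brain by one of HAL's human "
                     "brain builders.")
    else:
        lines.append("HAL 1000 believes this because: %s"
                     % "; ".join(justification or []))
    return lines
```

Boiling the text down is then just a matter of editing these few template strings in one place.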
Bill
----------------------------
-Josh
But OTHER items will be intentionally reversed. We would predict that humans would rate these as less believable than unreversed items.
This is what I did in my dissertation. In study 2, half of the items were unreversed and the other half were reversed. In study 3, a third of the items were unreversed, another third were "slightly" reversed, and another third were "strongly" reversed.
There are two reasons we want to do this reversal stuff. One reason is to catch liars or vandals.
The OTHER reason is to allow us to compare the mean believability of different groups of items. E.g....
- unreversed items vs. reversed items
- human-generated items vs. machine-generated items
- deductions vs. ground facts
...this is all part of the computational ablation paradigm and it figured big time in my dissertation. It is an example of what I mean by good and rigorous methodology.
[Now, it occurs to me that there is a THIRD, more minor, user-interfacey sorta reason to do this... That is that we want quick *coarse* judgements about whether a commonsense assertion is a good one or not. One way to obtain such coarseness is to throw in a fair number of ridiculous assertions.]
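Both analyses of the reversed items (catching liars/vandals, and comparing mean believability across groups) are simple once the ratings are stored. A toy sketch, with a made-up (rater, item_group, rating) layout and a made-up suspicion threshold:

```python
# Toy sketch of the two reversal analyses:
# (1) flag raters who call too many reversed items believable
#     (possible liars or vandals), and
# (2) compare mean believability across item groups
#     (the ablation comparisons).
# Ratings: 1 = believable/true, 0 = unbelievable/false.
# The data layout and the 0.5 threshold are assumptions for illustration.

def mean(xs):
    return float(sum(xs)) / len(xs)

def suspect_raters(ratings, threshold=0.5):
    """ratings: list of (rater, item_group, rating) tuples. A rater
    who rates more than `threshold` of the reversed items believable
    is flagged as suspect."""
    by_rater = {}
    for rater, group, r in ratings:
        if group == "reversed":
            by_rater.setdefault(rater, []).append(r)
    return sorted(r for r, rs in by_rater.items() if mean(rs) > threshold)

def group_means(ratings):
    """Mean believability per item group (reversed vs. unreversed,
    deductions vs. ground facts, etc.)."""
    by_group = {}
    for rater, group, r in ratings:
        by_group.setdefault(group, []).append(r)
    return {g: mean(rs) for g, rs in by_group.items()}
```

If the reversed-group mean is not clearly below the unreversed-group mean, that would suggest the raters (or the reversal wording) have a problem.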
Well, I should enlist some other AI gurus' opinions on this before I spout off too loudly about good and rigorous methodology. Speaking of AI gurus, Peter, are you on this list yet?
So Josh and Joshua, does this make sense conceptually and design-wise?
Joshua, can you implement this? Note: just getting the believability ratings up there is step one. Implementing the Feedback to User is step two.
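Step one could start as small as this: serve assertions one at a time and record true/false verdicts. A plain in-memory sketch (class and method names are illustrative; the real version would sit behind a web page and a database):

```python
# Minimal sketch of step one: collect true/false believability ratings.
import random

class RatingStudy:
    def __init__(self, items, seed=0):
        self.items = list(items)   # assertion strings to rate
        self.ratings = []          # (user, item, verdict) tuples
        self.rng = random.Random(seed)

    def next_item(self):
        """Pick the next assertion to show (random, for now)."""
        return self.rng.choice(self.items)

    def record(self, user, item, verdict):
        """verdict: True = user thinks HAL is right, False = wrong."""
        self.ratings.append((user, item, bool(verdict)))

    def tally(self, item):
        """How many users agreed / disagreed with HAL on this item,
        for the '5 other users agreed with you' feedback line."""
        votes = [v for _, i, v in self.ratings if i == item]
        return votes.count(True), votes.count(False)
```

Step two (the Feedback to User) then only needs `tally()` plus the feedback wording on top of this.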
Bill
-Josh
_______________________________________________
Heartlogic-dev mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/heartlogic-dev
