--- Gabriel R <[EMAIL PROTECTED]> wrote:
> Also, if you can think of any way to turn the knowledge-entry process into a
> fun game or competition, go for it.  I've been told by a few people working
> on similar projects that making the knowledge-providing process engaging and
> fun for visitors ended up being a lot more important (and difficult) than
> they'd expected.

Cyc has a game like this called FACTory at http://www.cyc.com/
Its purpose is to help refine Cyc's knowledge base.  It presents statements
and asks you to rate them as true, false, don't know, or doesn't make sense.
For example:

- Most shirts are heavier than most appendixes.
- Pages are typically located in HVAC Chem Bio facilities.
- Terminals are typically located in studies.
- People perform or are involved in paying a mortgage more frequently than they
perform or are involved in overbearing.
- Most BTU dozer blades are wider than most T-64 medium tanks.

The game exposes Cyc's shortcomings pretty quickly.  Cyc seems to lack a world
model and a language model.  Sentences seem to be constructed by relating
common properties of unrelated objects.  The set of common properties is
fairly small: size, weight, cost, frequency (for events), containment, etc. 
There does not seem to be any sense that Cyc understands the purpose or
function of objects.  The result is that context is no help in disambiguating
terms that have more than one meaning, such as "appendix", "page", or
"terminal".
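
To make this concrete, here is a minimal sketch in Python of the kind of
generator that could produce sentences like the ones above.  This is only my
guess at the mechanism, not Cyc's actual code.  The FACTS table, its numbers,
and the names plural() and generate_statements() are all made up for
illustration.

```python
import itertools

# Hypothetical toy knowledge base: term -> {property: value}.
# Terms and numbers are invented for illustration; they are not Cyc data.
FACTS = {
    "shirt": {"weight_kg": 0.2, "width_m": 0.5},
    "appendix": {"weight_kg": 0.005, "width_m": 0.01},
    "T-64 medium tank": {"weight_kg": 38000.0, "width_m": 3.4},
}

# Comparative adjective used for each shared property.
COMPARATIVES = {"weight_kg": "heavier", "width_m": "wider"}


def plural(term):
    """Naive pluralizer; enough for the toy terms above."""
    return term + "es" if term.endswith("x") else term + "s"


def generate_statements(facts):
    """Return one comparison for every pair of terms and shared property."""
    statements = []
    for a, b in itertools.combinations(sorted(facts), 2):
        shared = sorted(set(facts[a]) & set(facts[b]))
        for prop in shared:
            # Put the term with the larger value first.
            big, small = (a, b) if facts[a][prop] >= facts[b][prop] else (b, a)
            statements.append(
                f"Most {plural(big)} are {COMPARATIVES[prop]} "
                f"than most {plural(small)}."
            )
    return statements


if __name__ == "__main__":
    for line in generate_statements(FACTS):
        print(line)
```

Nothing in this sketch knows whether "appendix" is an organ or part of a
book, or whether anyone would ever want to compare a shirt with a tank.  The
labels are just dictionary keys, so every pair of terms that shares a property
yields a grammatical but pointless sentence.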

A language model would allow a more natural grammar, such as "People pay
mortgages more often than they are overbearing".  This example also exposes
the fallacy of logical inference.  Inference allows you to draw conclusions
such as this, but why would you?  Inference is not a good model of human
thought.  A good model would compare related objects.  It might ask instead
whether people make mortgage payments more frequently than they receive
paychecks.  The game gives no hint that Cyc understands such relations.

Cyc has millions of hand-coded assertions.  It has taken over 20 years to get
this far, and it seems we are not even close.  This seems to be a problem with
every knowledge representation based on labeled graphs (frame-slot,
first-order logic, connectionist, expert system, etc.).  Using English words
to label
the elements of your data structure does not substitute for a language model. 
Also, this labeling tempts you to examine and update the knowledge manually. 
We should know by now that there is just too much data to do this.


-- Matt Mahoney, [EMAIL PROTECTED]
