Matt,

> BTW in your proposal to integrate the link parser, I did not see a
> test plan. How do you plan to evaluate the results and compare it with
> the system as it currently exists?

This is a tricky matter, because there are many "correct" answers both
for the semantic mapping of an English sentence and for the English
verbalization of a semantic network.

I think we will need to have a handful of expert humans rate the
results on a small test corpus.  At the moment I can't see any other
way to do it.  It's much like asking how to evaluate an automated
translation system: ultimately some human who knows both languages
has got to check whether the translation makes sense.  (Here we are
talking about translation between English and OpenCog's internal
formal language.)
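
As a rough illustration of how such ratings could be tallied, here is
a minimal Python sketch.  Everything in it is an assumption made for
the example, not part of any existing OpenCog tooling: the 1-5 rating
scale, the sentence ids, the function names, and the disagreement
threshold are all hypothetical.

```python
from statistics import mean

def summarize_ratings(ratings):
    """Average the expert scores per test sentence and overall.

    `ratings` maps a (hypothetical) sentence id to the list of scores
    given by the human raters, assumed here to be on a 1-5 scale.
    """
    per_item = {sid: mean(scores) for sid, scores in ratings.items()}
    overall = mean(list(per_item.values()))
    return per_item, overall

def disputed_items(ratings, threshold=2):
    """Return ids where raters differ by at least `threshold` points.

    These are the sentences worth discussing by hand, since there may
    be several "correct" mappings and the raters may each have a
    different one in mind.
    """
    return sorted(
        sid for sid, scores in ratings.items()
        if max(scores) - min(scores) >= threshold
    )

if __name__ == "__main__":
    # Made-up scores from three raters on two test sentences.
    example = {"s1": [4, 5, 4], "s2": [2, 5, 3]}
    per_item, overall = summarize_ratings(example)
    print(per_item, overall)
    print(disputed_items(example))
```

The point of flagging disputed items separately is that a low average
and a split verdict mean different things: the first suggests a bad
mapping, the second suggests the raters have different "correct"
answers in mind.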

The current systems work so poorly that comparing the new system to
the current one will be pointless.  RelEx and the link parser work
fine, but the RelEx2Frame rules for translating RelEx output into
OpenCog Atoms really suck, and the NLGen system currently deals
effectively only with simple sentences.  So we won't need any
scientific comparison to tell that the new system is working better
than the old RelEx2Frame and NLGen.  (The old systems would be
fixable in principle, but we decided it would be better to take a new
approach.)

For an OpenCog dialogue system, too, I think the evaluation is going
to have to be qualitative, based on explicit human ratings of the
system's responses to various user utterances in various contexts.
After all, the goal of "human-like conversation" is in the end a
qualitative thing; the Turing Test itself is based on qualitative
evaluation, though cleverly wrapped in a "can you distinguish this AGI
from a human" test.

-- Ben G

