Matt,

> BTW in your proposal to integrate the link parser, I did not see a
> test plan. How do you plan to evaluate the results and compare it with
> the system as it currently exists?

This is a tricky matter, because there are many "correct" answers for the semantic mapping of an English sentence, and likewise for the English verbalization of a semantic network. I think we will need to have a handful of expert humans rate the results on a small test corpus. At the moment I can't see any other way to do it. It's much like asking how to evaluate an automated translation system: ultimately some human who knows both languages has got to check whether the translation makes sense. (Here we are talking about translation between English and OpenCog's internal formal language.)

The current systems work so poorly that comparing the new system to the current one will be pointless. RelEx and the link parser work fine, but the RelEx2Frame rules for translating RelEx output into OpenCog Atoms really suck, and the NLGen system currently deals effectively only with simple sentences. So we won't need any scientific comparison to tell that the new system is working better than the old RelEx2Frame and NLGen. (The old systems would be fixable in principle, but we decided it would be better to take a new approach.)

For an OpenCog dialogue system, also, I think the evaluation is going to have to be qualitative, based on explicit human ratings of the system's responses to various user utterances in various contexts. After all, the goal of "human-like conversation" is in the end a qualitative thing; the Turing Test itself is based on qualitative evaluation, though cleverly wrapped in a "can you distinguish this AGI from a human" test.

-- Ben G

-------------------------------------------
AGI Archives: https://www.listbox.com/member/archive/303/=now
