Dear all,

As for semantic mapping evaluation, there are (as you may know) some textual entailment contests, such as:

- PASCAL RTE: http://www.nist.gov/tac/2011/RTE/
- NTCIR: http://research.nii.ac.jp/ntcir/ntcir-10/tasks.html (-> Recognizing Inference in TExt, "RITE-2")

though I don't know if they are of AGIers' interest.
Just FYI.

--
ARAKAWA, Naoya

On 2012/12/28, at 1:31, Ben Goertzel <[email protected]> wrote:

> Well, there are several things to test here:
>
> -- the semantic mapping
>
> -- the language generation
>
> -- the overall behavior of the dialogue system
>
> I was talking about testing of the first two…
>
> On Thu, Dec 27, 2012 at 11:22 AM, Matt Mahoney <[email protected]> wrote:
>> On Thu, Dec 27, 2012 at 11:12 AM, Ben Goertzel <[email protected]> wrote:
>>> I think we will need to have a handful of expert humans rate the
>>> results on a small test corpus.
>>
>> What about preparing a set of questions and answers in advance?
>>
>> I am thinking about eliminating a source of bias, namely "I didn't
>> think of that answer, but it is close enough". Of course this
>> introduces a second bias, namely tuning the system to pass the test.
>> For that you would need to withhold part of the test questions until
>> the end.
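The withheld-question protocol Matt describes could be sketched roughly as follows. This is a minimal, hypothetical illustration (the function name and split fraction are assumptions, not anything from the thread): a fixed question/answer set is shuffled and divided into a visible development portion and a withheld portion that is only scored at the end, to limit tuning-to-the-test bias.

```python
import random

def split_qa_set(qa_pairs, withheld_fraction=0.3, seed=42):
    """Split a fixed question/answer set into a development portion
    (visible during system tuning) and a withheld portion that is
    only revealed and scored at the end of the evaluation."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    pairs = list(qa_pairs)
    rng.shuffle(pairs)
    cut = int(len(pairs) * (1 - withheld_fraction))
    return pairs[:cut], pairs[cut:]

# Toy example: 10 prepared question/answer pairs.
qa = [("Q%d" % i, "A%d" % i) for i in range(10)]
dev, withheld = split_qa_set(qa)
print(len(dev), len(withheld))  # 7 3
```

With a pre-agreed answer key, human raters only need to judge borderline matches, which addresses the "close enough" bias Matt mentions.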
