Ben's tweet of Eliezer Yudkowsky's dire forecast made 8 years ago seems rather humorous today.
http://acceleratingfuture.com/sl4/archive/0501/10611.html

As does Ben's response.
http://acceleratingfuture.com/sl4/archive/0501/10613.html

Really, OpenCog (then Novamente) is going to recursively self-improve and kill us all?

So what went wrong? I've been lurking on the OpenCog mailing list for a couple of years. There is a lot of software development being done. But it is hard to tell whether any real progress is being made, because there is still no test set (nor has there ever been one) by which progress could be measured. Ben has mentioned a few ideas for tests, like getting an online university degree, or playing with a box of toys. But we aren't there yet. I can imagine a potential investor asking when we will get there, and the answer will be either some made-up date or "I don't know". And we know what happens with made-up dates.

Why don't we know? Because there are no tests of incremental progress. So as far as anyone can tell, there has been no progress since 2005.

During the 3 years it took to build Watson, the team tested it on Jeopardy games and watched its precision (at 50% recall) gradually improve from 15% to 90%, the level they needed to beat the best humans. Every 3 months, they saw a 10% increase and knew they were on the right track and could even forecast a completion date. What does OpenCog have that is equivalent to this?

Here is another example. How much more knowledge does Cyc need to add to its sea of assertions to "break the software brittleness bottleneck"? That was the goal in 1984 when the project was started. Of course, nobody knows. Why not? Because there is no test for measuring progress.

I've written a rough draft on the cost of AGI, which many of you have already read.
https://docs.google.com/document/d/1cQiaH81rB5l9eLRYZFSi_tOLzRzOsY8wVruimPUWybg/edit

If you think this is wrong, then please come up with some tests to prove it.
Here are some simple ones for now:

- Fill in missing words in text, with the goal of human-level accuracy. (Can RelEx or MOSES do this?)
- Recognize printed words or common objects in images with human-level accuracy. (Can DeSTIN do this?)
- Teach a robot to throw a ball.

These are nice, simple tests that give a numeric answer. But you will notice in preparing the test set that even this step is not trivial. Then maybe you can tell me what it will cost to solve these problems, in terms of hardware, software, and training data.

-- Matt Mahoney
