A tougher Turing Test shows chatbots are still pretty stupid 

A simple way to defeat our robotic overlords.


By DAVID NIELD   15th JULY 2016 
http://www.sciencealert.com/a-tougher-turing-test-shows-chatbots-are-still-pretty-stupid?perpetual=yes&limitstart=1


To find out just how advanced our current AI systems are, researchers have 
developed a tougher Turing Test - called the Winograd Schema Challenge - which 
measures how well robotic intelligence matches human intelligence.

In the end, the team found that - even though AI is definitely improving every 
day - our robotic pals are still seriously lacking in common sense, 
suggesting that it will be some time before AI is fully ready to meld with 
society.

First, before we go any further into the new competition's results, it's 
important to define what a 'Turing Test' actually is. 

Devised by Alan Turing back in 1950, the Turing Test is a way 
for researchers to challenge computer-based intelligence, to see if it can 
become indistinguishable from human intelligence - which is basically the goal 
for AI researchers.

These tests are mostly language-based because human language is - when you 
truly think about it - super weird. 

In short, Turing proposed that AI should aim for machines a human can talk to 
without realising they aren't human. As anyone who has ever screamed at 
Siri knows, this is much easier said than done.

So, to test current AI systems, the Winograd Schema Challenge was created by 
Hector Levesque from the University of Toronto. Basically, the challenge pits 
artificial intelligence against sentences that are ambiguous to a machine but 
still simple for humans to understand.

http://commonsensereasoning.org/winograd.html

The best way to understand the test is to see a few samples in action. Take 
this question: 

"The trophy would not fit in the brown suitcase because it was too big. What 
was too big?"

The trophy, obviously, because if the suitcase was too big the sentence 
wouldn't make sense. But bots still struggle with this kind of language.

Here's another: 

"The city councilmen refused the demonstrators a permit because they feared 
violence."

Most of us would recognise that the city councilmen feared violence, rather 
than the demonstrators, but again that's tricky for a computer to understand.
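To make the structure of these puzzles concrete, here is a short Python sketch. The data format and function names are my own, not the contest's official ones: each schema is an ambiguous sentence, a pronoun, two candidate antecedents, and the answer a human would give. A solver that guesses at random scores about 50 percent - roughly what the best entries managed.

```python
import random

# Each Winograd schema pairs an ambiguous pronoun with exactly two
# candidate antecedents, only one of which makes sense to a human.
# (This data structure is illustrative, not the contest's actual format.)
schemas = [
    {
        "sentence": ("The trophy would not fit in the brown suitcase "
                     "because it was too big."),
        "pronoun": "it",
        "candidates": ["the trophy", "the suitcase"],
        "answer": "the trophy",
    },
    {
        "sentence": ("The city councilmen refused the demonstrators a "
                     "permit because they feared violence."),
        "pronoun": "they",
        "candidates": ["the councilmen", "the demonstrators"],
        "answer": "the councilmen",
    },
]

def guess_randomly(schema):
    """A baseline 'solver' that picks one of the two candidates at
    random - by construction it scores about 50 percent over many
    schemas, which is close to what the winning entries achieved."""
    return random.choice(schema["candidates"])

correct = sum(guess_randomly(s) == s["answer"] for s in schemas)
print(f"{correct}/{len(schemas)} correct")
```

Part of what makes the schemas so resistant to shortcuts is that a one-word change ("too big" becomes "too small") flips the correct answer, so memorised word statistics alone can't settle which reading is right.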

The contest's two winning entries - both of which produced the correct 
interpretation only 48 percent of the time - were programmed by Quan Liu, from the 
University of Science and Technology of China, and Nicos Issak from the Open 
University of Cyprus. 

Unfortunately, a success rate of 90 percent is required to claim the $25,000 
prize, so they couldn't cash in.

As Will Knight reports at the MIT Technology Review, this kind of understanding 
is difficult to create from statistical analysis (which is what computers are 
good at), but also takes an impossibly long time to code by hand.

https://www.technologyreview.com/s/601897/tougher-turing-test-exposes-chatbots-stupidity/

"It's unsurprising that machines were barely better than chance," said one of 
the contest's advisors, research psychologist Gary Marcus from New York 
University.

The Turing Test only asks bots to be clever enough for judges to be unsure if 
they're talking to a human or not – the Winograd Schema Challenge is on a whole 
new level that closely examines how well robots actually understand what people 
are saying to them.

The entry submitted by Quan Liu, built with assistance from researchers at 
York University in Toronto and the National Research Council of Canada, used 
a technique known as deep learning, where software is trained on huge amounts of 
data to spot patterns, loosely mimicking the neuron activity going on in our own 
brains.

More powerful computers and more complex mathematical equations mean deep 
learning processes are improving quickly, but there's still that hard-to-define 
element of human common sense that's so hard to copy.

Liu and his team used thousands of sample texts to teach their bot to 
distinguish between types of events, such as "playing basketball" or 
"getting injured".
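To picture the statistical approach in miniature, here is a toy sketch of the general idea - not Liu's actual system: score each candidate antecedent by how often it co-occurs with the predicate word in a training corpus, and pick the stronger association. The corpus and function names below are invented for illustration; a real system would use far more data and learned representations rather than raw counts.

```python
from collections import Counter
from itertools import combinations

# A tiny stand-in for the thousands of sample texts the article mentions.
corpus = [
    "the big trophy gleamed on the shelf",
    "the trophy was too big for the shelf",
    "she packed the small suitcase quickly",
    "the suitcase was light and small",
]

# Count how often each pair of words appears in the same sentence.
cooccur = Counter()
for sentence in corpus:
    words = set(sentence.split())
    for a, b in combinations(sorted(words), 2):
        cooccur[(a, b)] += 1

def association(word1, word2):
    """Co-occurrence count for an unordered word pair."""
    a, b = sorted((word1, word2))
    return cooccur[(a, b)]

def resolve(candidates, predicate):
    """Pick the candidate more strongly associated with the predicate."""
    return max(candidates, key=lambda c: association(c, predicate))

print(resolve(["trophy", "suitcase"], "big"))  # -> "trophy" on this corpus
```

On this hand-picked corpus the statistics happen to line up with the right answer, but the schemas are deliberately written so that surface associations like these don't reliably settle the causal question ("didn't fit *because* it was too big") - which is why such systems hovered near chance.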

While Apple, Google, and Facebook have been promoting digital assistants and 
bots of their own lately, none of them decided to enter the contest, and that's 
probably because their technology isn't ready.

"It could've been that those guys waltzed into this room and got a hundred 
percent and said 'hah!'" added Marcus. "But that would've astounded me."

If we really want our AI assistants to help us with everyday tasks then this 
kind of understanding is eventually going to be vital – it just might take us a 
long time to get there. 

The hope is that a breakthrough will rapidly raise the level of AI 
intelligence, but for now that's wishful thinking. 
