In principle, video on the internet may become a useful resource for
AGI learning.  In practice, however, the previous comments hand-wave
over a lot of very complex details, much as some of the early AI
pioneers believed that visual interpretation would be easy.

To reverse-engineer structures from an internet video you need to be
able to solve the SLAM problem without any prior knowledge of the
camera optics, its pose, or the scale of the scene.  Programs such as
Photosynth may be able to do this, but I think they only achieve it
by making many hypotheses about the camera properties and turning the
results into a big optimization problem which can take a day or more
of number crunching to solve.
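
To make that concrete, here is a minimal sketch (a toy of my own, not
Photosynth's actual pipeline) of the kind of optimization involved:
jointly estimating an unknown focal length, camera pose and sparse 3D
structure from noisy 2D observations by nonlinear least squares.  It
uses purely synthetic data and assumes numpy and scipy are available;
all the numbers are illustrative.

import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)

# Synthetic "scene": 30 points in front of two cameras.
pts_true = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 6.0], (30, 3))
f_true = 800.0                      # focal length in pixels, unknown to the solver
angles_true = (0.2, 0.1)            # rotation of camera 2 about y then x, radians
t_true = np.array([0.8, 0.0, 0.1])  # position of camera 2

def rot(ay, ax):
    # Rotation about the y axis followed by rotation about the x axis.
    cy, sy = np.cos(ay), np.sin(ay)
    cx, sx = np.cos(ax), np.sin(ax)
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rx @ Ry

def project(pts, f, angles, t):
    # Pinhole projection into a camera rotated by 'angles' and placed at t.
    p = (pts - t) @ rot(*angles).T
    return f * p[:, :2] / p[:, 2:3]

# The "video frames": noisy 2D feature tracks seen from two viewpoints.
obs1 = project(pts_true, f_true, (0.0, 0.0), np.zeros(3)) + rng.normal(0.0, 0.5, (30, 2))
obs2 = project(pts_true, f_true, angles_true, t_true) + rng.normal(0.0, 0.5, (30, 2))

def residuals(x):
    # Reprojection error of a candidate (focal length, pose, structure).
    f, angles, t, pts = x[0], (x[1], x[2]), x[3:6], x[6:].reshape(-1, 3)
    r1 = project(pts, f, (0.0, 0.0), np.zeros(3)) - obs1
    r2 = project(pts, f, angles, t) - obs2
    return np.concatenate([r1.ravel(), r2.ravel()])

# One deliberately imperfect hypothesis about the optics, pose and scene;
# a real system would try many such hypotheses and keep the best fit.
x0 = np.concatenate([[600.0, 0.1, 0.0], [0.5, 0.0, 0.0],
                     (pts_true + rng.normal(0.0, 0.2, pts_true.shape)).ravel()])
fit = least_squares(residuals, x0, method="trf")
print("recovered focal length: %.1f (true %.1f)" % (fit.x[0], f_true))

Photosynth-scale bundle adjustment does the same kind of thing over
thousands of photos and millions of points, which is where the day or
more of number crunching comes from.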

Alternatively, simpler 2D kinds of recognition could be used, but
these are usually extremely ad hoc and suffer from scaling problems
(combinatorial explosion).  Some situations, such as recognizing
faces, may be special cases, but it seems infeasible to create 2D
templates for every possible viewing angle and scale of an object.
If you try doing this in practice you soon come to the conclusion
that this probably isn't how biological vision works.  What you can
see in 2D is, after all, just a shadow of a higher-dimensional
object.
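
A back-of-envelope count makes the explosion obvious (the step sizes
below are my own illustrative assumptions, not figures from any
particular vision system):

# Rough count of 2D templates needed for brute-force matching of one
# rigid object; the step sizes are illustrative assumptions.
view_angles = 36 * 18      # 10-degree steps in azimuth and elevation
in_plane_rotations = 36    # 10-degree steps of image rotation
scales = 10                # discrete scale steps
print(view_angles * in_plane_rotations * scales)  # 233280 templates

And that is before illumination, deformation or occlusion are even
considered, and it is per object.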
