In principle, video on the internet may become a useful resource for AGI learning. In practice, however, the previous comments hand-wave over a lot of very complex details, much as some of the early AI pioneers believed that visual interpretation would be easy.
To reverse engineer structures from an internet video, you need to solve the SLAM problem without any prior knowledge of the camera optics, its pose, or the scale of the scene. Programs such as Photosynth may be able to do this, but I think they achieve it only by making many hypotheses about the camera properties and turning the results into a big optimization problem, which can take a day or more of number crunching to solve.

Alternatively, simple 2D types of recognition could be used, but these are usually extremely ad hoc and suffer from scaling problems (combinatorial explosions). Some situations, such as recognizing faces, may be special cases, but it seems infeasible to create 2D templates for every possible viewing angle and scale of an object. If you try doing this in practice, you soon come to the conclusion that this probably isn't how biological vision works. What you can see in 2D is, after all, just a shadow of a higher-dimensional object.
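To make the "hypotheses about camera properties turned into an optimization problem" point concrete, here is a deliberately tiny sketch (not Photosynth's actual algorithm, and nothing like full bundle adjustment): a single unknown camera intrinsic, the focal length, is treated as a hypothesis and refined by minimizing reprojection error against synthetic observations. Real structure-from-motion does this jointly over poses, intrinsics, and thousands of 3D points across many images, which is why it takes so long.

```python
# Toy illustration of camera-hypothesis refinement by reprojection-error
# minimization. All names, numbers, and the scene itself are made up for
# the example; only the focal length is unknown here, whereas real SfM
# must also recover pose, scale, and the 3D structure.

def project(point3d, focal):
    """Pinhole projection of a 3D point (camera at the origin, looking down +z)."""
    x, y, z = point3d
    return (focal * x / z, focal * y / z)

# Synthetic "scene": 3D points plus the 2D observations that a camera
# with true focal length 500 would have produced.
TRUE_FOCAL = 500.0
points3d = [(1.0, 0.5, 4.0), (-0.8, 1.2, 6.0), (0.3, -0.7, 5.0)]
observations = [project(p, TRUE_FOCAL) for p in points3d]

def reprojection_error(focal):
    """Sum of squared 2D errors for a hypothesized focal length."""
    err = 0.0
    for p, (u, v) in zip(points3d, observations):
        pu, pv = project(p, focal)
        err += (pu - u) ** 2 + (pv - v) ** 2
    return err

def refine_focal(guess, lr=0.5, steps=200):
    """Crude gradient descent on the focal-length hypothesis
    (step size hand-tuned for this toy scene)."""
    f = guess
    eps = 1e-3
    for _ in range(steps):
        # Central-difference estimate of d(error)/d(focal).
        grad = (reprojection_error(f + eps) - reprojection_error(f - eps)) / (2 * eps)
        f -= lr * grad
    return f

estimate = refine_focal(300.0)  # starts far from the true value, converges near 500
```

Even in this one-parameter case the answer comes only from iterative error minimization; scale that up to six pose parameters per frame plus the scene geometry and the day-long number crunching is unsurprising.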