On Wed, Nov 6, 2013 at 6:07 PM, Joseph De Nicolo <j.denic...@neu.edu> wrote:
>
> I used the imdbpy2sql.py program (version 4.9)
Please use the version from the repository, since 4.9 is way too buggy.

> but my researchers are telling me some data is missing.

Which data?

> Is there supposed to be a table to each corresponding .gz file in the ftp
> mirror?

No. The database schema is not normalized, basically for performance
reasons (at insert time).

Main information about titles and names is stored in the 'title',
'name', 'char_name' and 'company_name' tables.

Information about movies is stored in the 'movie_info' table,
specifically in the 'info' field, which is always a text field; its
meaning is given by the 'info_type_id' column (a foreign key to
'info_type.id').

Information about persons is stored, in a similar way, in the
'person_info' table.

Information about people who worked on a movie is stored in the
'cast_info' table: there, the 'person_role_id' refers to a role
as per the 'role_type' table.

Hope this helps; if you have more specific questions, just ask.

-- 
Davide Alberani <davide.alber...@gmail.com> [PGP KeyID: 0x465BFD47]
http://www.mimante.net/

_______________________________________________
Imdbpy-help mailing list
Imdbpy-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-help
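
To illustrate the 'movie_info' / 'info_type' layout described in the reply, here is a minimal sketch using Python's sqlite3 module and an in-memory database. It is not the real imdbpy2sql.py schema: only the table names and the 'info', 'info_type_id' and 'info_type.id' columns come from the message above. The 'movie_id', 'title.title' and 'info_type.info' columns, the helper function names, and the sample rows are assumptions made up for the example.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database that mimics the described layout.

    Columns other than 'info', 'info_type_id' and 'info_type.id' are
    assumptions for this sketch, and the rows are invented sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE info_type (id INTEGER PRIMARY KEY, info TEXT NOT NULL);
        CREATE TABLE movie_info (
            id INTEGER PRIMARY KEY,
            movie_id INTEGER NOT NULL REFERENCES title(id),
            info_type_id INTEGER NOT NULL REFERENCES info_type(id),
            info TEXT NOT NULL  -- always text; meaning depends on info_type_id
        );
        """
    )
    conn.execute("INSERT INTO title (id, title) VALUES (1, 'Example Movie')")
    conn.executemany(
        "INSERT INTO info_type (id, info) VALUES (?, ?)",
        [(1, "genres"), (2, "runtimes")],
    )
    conn.executemany(
        "INSERT INTO movie_info (movie_id, info_type_id, info) VALUES (?, ?, ?)",
        [(1, 1, "Drama"), (1, 2, "120")],
    )
    conn.commit()
    return conn


def movie_info_rows(conn, movie_title):
    """Return (kind, value) pairs for a title, resolving info_type_id to its label."""
    query = """
        SELECT it.info, mi.info
        FROM movie_info AS mi
        JOIN title AS t ON t.id = mi.movie_id
        JOIN info_type AS it ON it.id = mi.info_type_id
        WHERE t.title = ?
        ORDER BY it.info, mi.info
    """
    return conn.execute(query, (movie_title,)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for kind, value in movie_info_rows(demo, "Example Movie"):
        print(f"{kind}: {value}")
```

The point of the sketch is that there is no per-file table: every kind of movie information lands in the same 'info' text column, and the join against 'info_type' is what tells a genre apart from a runtime. Against a real database, check the actual column names before reusing the query.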