On May 23, Gozde Ozbal <gozba...@gmail.com> wrote:

> But I see that in my current database, imdb_id column in title table
> is all blank, although all other fields are full. Please also note
> that I have downloaded almost all gz files from imdb interface.

Unfortunately, the plain text data files contain no imdbID (the _real_
unique ID used by the IMDb web server). Every movie does have its own
movieID, but that one is built by IMDbPY and has no correlation with
the imdbID.

> Could you please explain which file causes this field to be set or
> is there any other way to store the thumbnails in my local database
> during the run of imdbpy2sql?

No way. Or better, this will work:

    from imdb import IMDb
    ia = IMDb('sql', ...)
    # the_movie_id is the one used in 'sql', not the one on the web.
    movie = ia.get_movie(the_movie_id)
    print(ia.get_imdbID(movie))

The get_imdbID method takes an object (Movie, Person, Character,
Company) and, connecting to the web server, tries to return the "real"
imdbID. If the imdbID is retrieved, it's also stored in the SQL
database (in the field you see all NULL) to prevent further queries.
As a bonus, the imdbpy2sql.py script tries to preserve the imdbIDs
stored in the database when you update your local copy with a new set
of plain text data files.

> Otherwise, even if my imdb_ids become full, I will have to run other
> scripts for storing thumbnails for each movie in my local database,
> which will certainly cause great loss of time.

That's for sure, but honestly I _strongly_ discourage you from doing
it. IMDbPY was never intended to be used this way, and this is why
there's no easy way to do what you're asking for (and never will be).
Querying the IMDb web server for every single movie (person,
character, company) would be a (not so much, ok...) huge load for them
and painfully slow for you. And probably not that useful, too. And the
images are copyrighted, by the way.

In my opinion, get_imdbID should be used very conservatively, and only
when really needed. I refuse to believe that a program needs to fetch,
ahead of time, the thumbnail for every movie in the database.

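A minimal sketch of the conservative use of get_imdbID argued for here:
resolve the imdbID only for the titles a user actually asks about, and
keep the answers in memory. The helper name `lookup_imdb_ids` and the
`cache` dict are made up for illustration; only `get_movie` and
`get_imdbID` come from the message itself.

```python
def lookup_imdb_ids(ia, movie_ids, cache=None):
    """Return {movieID: imdbID} for just the requested titles.

    `ia` is expected to be an IMDb('sql', ...) access object; `cache`
    is a hypothetical in-memory dict used to avoid repeated lookups.
    """
    if cache is None:
        cache = {}
    result = {}
    for movie_id in movie_ids:
        if movie_id not in cache:
            # movie_id is the local 'sql' movieID, not the web one.
            movie = ia.get_movie(movie_id)
            # May contact the IMDb web server, so call it sparingly.
            cache[movie_id] = ia.get_imdbID(movie)
        result[movie_id] = cache[movie_id]
    return result
```

Passing the same `cache` dict on later calls means a title that was
already looked up is not looked up again by this helper.
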
Fetching thumbnails should, at most, be done later, when a given user
is importing/loading/populating/displaying their collection of movies
(in the tens/hundreds of titles, I assume). And even when used this
way, I advise you to at least change the cookies sent by IMDbPY to the
web server, so that your program can be distinguished from IMDbPY
itself. In your case, using 'sql', you have to monkey patch IMDbPY:
import imdb.parser.http and override http._cookie_id and
http._cookie_uu with the cookies from a newly created account on IMDb
(see your browser for the values).

HTH,
--
Davide Alberani <davide.alber...@gmail.com> [GPG KeyID: 0x465BFD47]
http://erlug.linux.it/~da/
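
For reference, the cookie monkey patch described in the message could
look roughly like the sketch below. The helper `override_cookies` is
hypothetical; the attribute names `_cookie_id` and `_cookie_uu` are the
ones named in the message, and whether they exist depends on the
IMDbPY version installed.

```python
def override_cookies(http_module, cookie_id, cookie_uu):
    """Replace the module-level cookie values on the given module.

    Returns the previous (id, uu) pair so it can be restored later.
    """
    previous = (getattr(http_module, "_cookie_id", None),
                getattr(http_module, "_cookie_uu", None))
    http_module._cookie_id = cookie_id
    http_module._cookie_uu = cookie_uu
    return previous

# Intended use (assumes IMDbPY is installed; take the two values from
# the cookies your browser holds for your own IMDb account):
#   import imdb.parser.http as http
#   override_cookies(http, "<your id cookie>", "<your uu cookie>")
```

Taking the module as a parameter keeps the helper importable even where
IMDbPY is not installed, and makes the change easy to undo.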