Re: [Imdbpy-help] Data too long for column 'phonetic_code' at row 88

Davide Alberani Sat, 23 May 2009 09:43:52 -0700

On May 23, Gozde Ozbal <gozba...@gmail.com> wrote:

> But I see that in my current database, imdb_id column in title table
> is all blank, although all other fields are full. Please also note
> that I have downloaded almost all gz files from imdb interface.


Unfortunately, the plain text data files contain no imdbID (the
_real_ unique ID used by the IMDb web server).  Every movie does
have its own movieID, but that one is built by IMDbPY and has no
correlation with the imdbID.

> Could you please explain which file causes this field to be set or
> is there any other way to store the thumbnails in my local database
> during the run of imdbpy2sql?

No way.
Or rather, this will work for the imdbID:

  from imdb import IMDb
  ia = IMDb('sql', ...)
  # the_movie_id is the movieID used in 'sql', not the one used on the web.
  movie = ia.get_movie(the_movie_id)
  print(ia.get_imdbID(movie))

get_imdbID takes an object (Movie, Person, Character, Company)
and, by connecting to the web server, tries to return the "real" imdbID.
If the imdbID is retrieved, it's also stored in the SQL database
(in the field you currently see all NULL) to prevent further queries.
As a bonus, the imdbpy2sql.py script tries to preserve the imdbIDs
stored in the database when you update your local copy with
a new set of plain text data files.

> Otherwise, even if my imdb_ids become full, I will have to run other
> scripts for storing thumbnails for each movie in my local database,
> which will certainly cause great loss of time.

That's for sure, but honestly I _strongly_ discourage you from doing it.

IMDbPY was never intended to be used this way, which is why
there's no easy way to do what you're asking for (and never will be).
Querying the IMDb web server for every single movie (person, character,
company) would be a (not so much, ok...) huge load for them and painfully
slow for you.  And probably not that useful, either.  And the images
are copyrighted, by the way.

In my opinion, get_imdbID should be used very conservatively,
and only when really needed.
I refuse to believe that a program needs to fetch the thumbnail
for every movie in the database ahead of time.
This should be done later at most, when a given user is
importing/loading/populating/displaying their collection of movies
(in the tens/hundreds of titles, I assume).
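
A rough sketch of that on-demand approach, for illustration only: the
helper name imdb_ids_for_collection and the max_lookups cap are
hypothetical and not part of IMDbPY.  It assumes `ia` is the object
returned by IMDb('sql', ...) and relies only on the get_movie and
get_imdbID calls shown earlier in this message.

```python
def imdb_ids_for_collection(ia, movie_ids, max_lookups=100):
    """Map local movieIDs to imdbIDs for a small collection of titles.

    `ia` is assumed to be the object returned by IMDb('sql', ...);
    only its get_movie() and get_imdbID() methods are used.
    `max_lookups` is a hypothetical safety cap, not an IMDbPY setting.
    """
    movie_ids = list(movie_ids)
    if len(movie_ids) > max_lookups:
        # Refuse bulk runs: this is meant for one user's collection,
        # not for every title in the database.
        raise ValueError('refusing to look up %d titles (cap is %d)'
                         % (len(movie_ids), max_lookups))
    result = {}
    for movie_id in movie_ids:
        # movie_id is the movieID used in 'sql', not the one on the web.
        movie = ia.get_movie(movie_id)
        # May contact the web server; the return value is kept as-is.
        result[movie_id] = ia.get_imdbID(movie)
    return result
```

Called with the handful of movieIDs a user is actually displaying, this
keeps the number of web requests small, and titles whose imdbID is
already stored in the database should not need a new query.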

And even when it's used this way, I advise at least changing the cookies
sent by IMDbPY to the web server, so that your program can be
distinguished from IMDbPY itself.
In your case, using 'sql', you have to monkey patch IMDbPY:
import imdb.parser.http and override http._cookie_id and
http._cookie_uu with the cookies from a newly created account
on IMDb (see your browser for the values).
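
A minimal sketch of that monkey patch, under stated assumptions: the
helper override_cookies is hypothetical, and the module is passed in
as an argument so the snippet does not depend on how IMDbPY is
installed.  The attribute names _cookie_id and _cookie_uu are the ones
named above; nothing else about the module's internals is assumed.

```python
def override_cookies(http_module, cookie_id, cookie_uu):
    """Replace the cookie values IMDbPY sends to the web server.

    `http_module` is expected to be imdb.parser.http.
    Returns the previous (id, uu) pair so it can be restored later;
    a missing attribute is reported as None.
    """
    previous = (getattr(http_module, '_cookie_id', None),
                getattr(http_module, '_cookie_uu', None))
    # Plain module-level assignment: this is the whole "monkey patch".
    http_module._cookie_id = cookie_id
    http_module._cookie_uu = cookie_uu
    return previous
```

After `import imdb.parser.http`, it would be called as
override_cookies(imdb.parser.http, your_id_value, your_uu_value),
where the two values are placeholders for the cookies copied from your
browser.  Doing this before the first get_imdbID call is probably the
safest choice.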


HTH,
-- 
Davide Alberani <davide.alber...@gmail.com> [GPG KeyID: 0x465BFD47]
http://erlug.linux.it/~da/
