Thanks for the suggestions! Unfortunately, nothing helped (I tried all of them). I upgraded SQLAlchemy and psycopg2 to the latest versions using easy_install, so I think our configurations are now identical except for the PostgreSQL version (I'm still using 8.3).

To find out exactly where the problem occurs, I slightly modified the readMovieList() function at line 1427 of imdbpy2sql.py, so that it now reads:

    mid = CACHE_MID.addUnique(title.decode('utf-8'), yearData)

Now it only says:

    ...
    SCANNING movies: #1 (2005) (movieID: 1)
    SCANNING movies: A Fu (2000) (movieID: 10001)
    SCANNING movies: Absconding (2005) (V) (movieID: 20001)
    SCANNING movies: Alone and Inhuman (2005) (movieID: 30001)
    SCANNING movies: Apartment 23 (1919) (movieID: 40001)
    SCANNING movies: Babalon (1999) (movieID: 50001)
    SCANNING movies: Belye nochi (1959) (movieID: 60001)
    SCANNING movies: Blood on the Turntable (2004) (TV) (movieID: 70001)
    SCANNING movies: Bullets for Bandits (1942) (movieID: 80001)
    SCANNING movies: Celebration (2010) (movieID: 90001)
     * FLUSHING MoviesCache...
    /usr/lib/python2.5/site-packages/IMDbPY-4.2-py2.5.egg/EGG-INFO/scripts/imdbpy2sql.py:626: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
      if ts[-1].lower() in _articles:
    ...

and then it continues importing data. The warning occurs only once.

Any ideas how I can intercept this UnicodeWarning so I can see which line of movie.list triggers it? I tried adding the following line to readMovieList():

    print "Title: " + title + ", counter: " + str(count)

but the output is far too big to read through: tons of lines.
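One possible way to intercept the warning (only a sketch, not something taken from IMDbPY): Python's warnings module can turn a UnicodeWarning into an exception, which can then be caught for each input line. The names find_unicode_warnings and fake_process below are hypothetical, and fake_process merely simulates the warning, since the implicit str/unicode comparison that triggers it exists only on Python 2. warnings.catch_warnings needs Python 2.6 or later; on 2.5 a plain warnings.simplefilter('error', UnicodeWarning) near the top of the script should be the rough equivalent.

```python
import warnings


def find_unicode_warnings(lines, process):
    """Call process(line) for every line and return (line_number, line)
    pairs for the lines on which a UnicodeWarning was issued."""
    offenders = []
    with warnings.catch_warnings():
        # Inside this block a UnicodeWarning is raised as an exception
        # instead of being printed once and then suppressed.
        warnings.simplefilter("error", UnicodeWarning)
        for number, line in enumerate(lines, 1):
            try:
                process(line)
            except UnicodeWarning:
                offenders.append((number, line))
    return offenders


def fake_process(line):
    """Hypothetical stand-in for the real title handling: it issues a
    UnicodeWarning for any byte string that is not pure ASCII."""
    try:
        line.decode("ascii")
    except UnicodeDecodeError:
        warnings.warn("failed to convert argument to Unicode", UnicodeWarning)


# Made-up sample data: only the third entry contains a non-ASCII byte.
sample = [b"A Fu (2000)", b"Belye nochi (1959)", b"Caf\xe9 (2005)"]
offenders = find_unicode_warnings(sample, fake_process)
for number, line in offenders:
    print("UnicodeWarning at line %d: %r" % (number, line))
```

In the real script the same idea would mean wrapping the call that handles each title in a try/except UnicodeWarning and printing the title and counter only when the exception fires, which avoids the flood of output from an unconditional print.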
BTW, after that change (decode('utf-8')) the DB content looks good, but since it contains almost 1.5M records I can't be 100% sure.

> Some questions and tests you can try:
> 1. Did it ever work on your system (with an old set of data, for example)?

Yeah, it worked some years ago :) This is my first try this year.

> 2. Have you tried to change the value of client_encoding in postgresql.conf?

I tried changing it from the default, but it makes no difference. I'm currently using client_encoding=UTF8 and, as the comment in postgresql.conf says, it "actually defaults to database encoding". My "imdb" database encoding is UTF8.

On Sun, 4 Oct 2009 16:27:49 +0200, Davide Alberani <davide.alber...@gmail.com> wrote:
> On Oct 04, Davide Alberani <davide.alber...@gmail.com> wrote:
>
>> Some questions and tests you can try:
>
> A thing I've forgotten: I assume you're using SQLObject; you can try
> playing with the 'use_unicode', 'charset' and 'sqlobject_encoding'
> (this last one deprecated) arguments of the connectionForURI function.
>
> You can find the call in the setConnection function of the
> imdb.parser.sql.objectadapter module (as you can see we already do
> some customization needed by MySQL).
>
> I _think_ you can even pass these parameters from the command line,
> with something like:
>   -u 'postgres://login:passw...@localhost/imdb?charset=utf8'
>
> but I'm not too sure about the syntax, see SQLObject's documentation.
>
> By the way, another thing you can try is to switch to SQLAlchemy,
> since IMDbPY supports both - just append, to your command line:
>   -o sqlalchemy
>
>
> HTH,
_______________________________________________
Imdbpy-help mailing list
Imdbpy-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-help