On Jan 10, Mike Castle <dalg...@gmail.com> wrote:

> Note:
> http:  u'The Motorcycle Diaries::(USA)'
> sql: u'Motorcycle Diaries, The (2004)::(USA)'
> 
> So, I wrote a routine to normalize those to get rid of ::(USA) and
> (2004)::(USA), great.
> 
> But now I have to move that damned article around too?!?

You can use the mighty functions provided by IMDbPY! ;-)
They are in the 'utils' module.

Once you have a title without the year and other funny things,
you can use the normalizeTitle function to move any known
article back to the front (I've done fairly extensive research on
the data we're working with to identify articles - a thing that
is more art than science, as you can guess).

Note that there are cases where your function can return
the wrong thing (I'm not sure about akas, but the year can
be something like (1999/II), meaning that this is the second
movie with the same title produced that year).
So your best hope, besides Obi-Wan Kenobi, is something like:

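  from imdb import utils
  title = u'Motorcycle Diaries, The (2004)'
  # split the long title into its parts (title, year, ...)
  tdict = utils.analyze_title(title, canonical=1)
  # move the trailing article back to the front
  straight_title = utils.normalizeTitle(tdict['title'])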
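
If you are curious what such a normalization involves, here is a rough
standard-library sketch of the idea. It is NOT how IMDbPY does it: the
names strip_year, move_article_to_front, ARTICLES and YEAR_RE are made up
for illustration, and the article list is a tiny hand-picked sample, so
treat it as a toy and prefer the functions in 'utils' for real data.

```python
import re

# Hypothetical, deliberately short article list (an assumption for this
# sketch); a real list needs many more entries and languages.
ARTICLES = ('the', 'a', 'an', 'der', 'die', 'das', 'le', 'la', 'les', 'il', 'el')

# Trailing "(2004)", "(1999/II)" or "(????)", optionally followed by a
# kind marker such as "(TV)" or "(V)".
YEAR_RE = re.compile(
    r'\s*\((\d{4}|\?{4})(?:/([IVXLCDM]+))?\)(?:\s*\((?:TV|V|VG|mini)\))?\s*$')


def strip_year(title):
    """Return (bare_title, year, index); year and index may be None."""
    m = YEAR_RE.search(title)
    if not m:
        return title.strip(), None, None
    return title[:m.start()].strip(), m.group(1), m.group(2)


def move_article_to_front(title, articles=ARTICLES):
    """Turn 'Motorcycle Diaries, The' into 'The Motorcycle Diaries'."""
    head, sep, tail = title.rpartition(', ')
    # Only move the tail if it is a known article; "Teil 2" stays put.
    if sep and head and tail.lower() in articles:
        return '%s %s' % (tail, head)
    return title


bare, year, index = strip_year(u'Motorcycle Diaries, The (2004)')
straight = move_article_to_front(bare)
```

The year is matched only at the end of the string and the article is moved
only when it appears in the list, so a title ending in ", Teil 2" is left
alone. Articles written with an apostrophe (L', etc.) are not handled here.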

>    (aka Tagebuch der Lust, Teil 2 (1999) (TV))  (Germany)
> 
> It looks like different languages have different ways of moving the
> article to the end of the string.  Interesting, I never knew that.

"Teil" is not an article in German, YOU INSENSITIVE ENGLISH-CENTRIC
AMERICAN!!! :-D
It means "part". :-)


HTH,
-- 
Davide Alberani <davide.alber...@gmail.com> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-help mailing list
Imdbpy-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-help
