Re: [Freevo-users] IMDB issues

2006-10-18 Thread Dirk Meyer
Karl Lattimer wrote:
> I've actually already done some work porting apple trailers to using SAX
> instead of regexp parsing. By making appletrailers use SAX it's a lot
> easier to maintain the code. Right now it's not perfect, and if Apple
> continues to move things over to itms then it'll be impossible to
> aggregate anyway in future, but it's still worth hacking around on for
> the moment.
>
> If I can develop appletrailers SAX further and get IMDB working at the
> same time that _WOULD_ be a nice patch to get into 1.6

What about BeautifulSoup? (I'm offline while writing this mail, so ask
Google for a URL.) BeautifulSoup is very robust against invalid
SGML/XML (e.g. missing tags), so site changes should break things less
often. But it would be a new dependency.

For future development I suggest putting stuff like that into
kaa.netsearch (the name sucks). The idea is to put everything that
requires web parsing into one kaa module with a fixed interface. On
changes, just release a new kaa.netsearch. Maybe kaa.spider would be a
good name for web based modules. But the usage of basic kaa stuff is
for 1.7.


Dischi

-- 
You might have mail.


-
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
___
Freevo-users mailing list
Freevo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freevo-users


Re: [Freevo-users] IMDB issues

2006-10-18 Thread Dirk Meyer
Jason Tackaberry wrote:
> On Tue, 2006-10-17 at 14:31 +0100, Karl Lattimer wrote:
>> Has anyone else been experiencing problems where the IMDB lookup
>> results in an FXD file being created with a blank title? The result is
>> that the file goes to the start of the list. There should probably be
>
> imdb probably changed their html again.
>
> Ideally, we would have a small file that exists on freevo.org that
> net-connected freevo boxen can pull periodically to update to new code
> that fixes issues like this that are sporadically temperamental.

Or create an extra kaa module just for web based stuff and we update
it on web site changes.


Dischi

-- 
2 + 2 = 5  for sufficiently large values of 2.




Re: [Freevo-users] IMDB issues

2006-10-17 Thread Karl Lattimer
Here's an intermediate fix.

If Duncan would like to put it into SVN and get it into the next
release then that'd be great.

PS. Sorry it's not a diff, I didn't have time to faff on, I just wanted
it fixed.

FYI - There are better ways to process HTML

To name but a few;
http://www.crummy.com/software/BeautifulSoup/documentation.html
http://docs.python.org/lib/module-xml.sax.html
http://docs.python.org/lib/module-xml.dom.html

Using regexp is a painful way of doing it. In this instance the file is
read in line by line and each line is matched against a regexp, which is
likely to need updating over time. The regular expressions were written
to process the HTML one line at a time, so a line break part of the way
through the text they were meant to match made it completely unparsable.
With SAX the file would still have parsed correctly. I switched the
regexp to use the page title instead of the title text and it now works
nicely.
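
To illustrate the failure mode described above, here is a small sketch
that compares line-by-line regexp matching with an event-driven parser
on a title that is split across two lines. Assumptions: the page
fragment is made up, the function names are hypothetical, and the
standard library's html.parser stands in for SAX because xml.sax only
accepts well-formed XML, which real-world HTML often is not. It is
Python 3, unlike the attached module.

```python
import re
from html.parser import HTMLParser

# Made-up page fragment: the title element is split across two lines.
PAGE = "<html><head><title>Example Movie\n(2006)</title></head><body></body></html>"


def title_by_line_regex(page):
    """Line-by-line regexp matching; misses an element split over two lines."""
    pattern = re.compile(r"<title>(.*?)</title>")
    for line in page.splitlines():
        match = pattern.search(line)
        if match:
            return match.group(1)
    return None


class TitleParser(HTMLParser):
    """Collects the text inside <title>, wherever the line breaks fall."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    @property
    def title(self):
        # Collapse internal whitespace, including the embedded newline.
        return " ".join("".join(self._chunks).split())


def title_by_parser(page):
    """Feed the whole page to the event-driven parser and return the title."""
    parser = TitleParser()
    parser.feed(page)
    parser.close()
    return parser.title
```

Here title_by_line_regex(PAGE) finds nothing because neither line
contains both the opening and the closing tag, while title_by_parser(PAGE)
still recovers the title text.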

This needs to be rewritten to use SAX, as does anything else that uses
re for HTML; SAX is easier to understand and easier to maintain. Web
services like RSS/ATOM and gadgetty things like IMDB and apple trailers
enhance freevo tremendously, but they can't do so if they keep breaking.

When freevo 2.0 is released I may look seriously into producing SAX
parsers which can provide these bits of functionality.

K,
# -*- coding: iso-8859-1 -*-
# ---
# helpers/fxdimdb.py - class and helpers for fxd/imdb generation
# ---
# $Id: fxdimdb.py 6939 2005-01-09 10:29:17Z dischi $
#
# Notes: see http://pintje.servebeer.com/fxdimdb.html for documentation,
# Todo: 
# - add support making fxds without imdb (or documenting it)
# - webradio support?
#
# ---
# $Log$
# Revision 1.7.2.1  2005/01/09 10:29:17  dischi
# make imdb work again
#
# Revision 1.7  2004/07/10 12:33:42  dischi
# header cleanup
#
# Revision 1.6  2004/06/20 13:06:20  dischi
# move freevo-rebuild-database to cache dir
#
# ---
# Freevo - A Home Theater PC framework
# Copyright (C) 2003 Krister Lagerstrom, et al.
# Please see the file freevo/Docs/CREDITS for a complete list of authors.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of MER-
# CHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
# Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
# --- */


# python has no data hiding, but this is the intended use...
# subroutines completely in lowercase are regarded as more "private" functions
# subRoutines are regarded as public

#some data
__author__ = "den_RDC ([EMAIL PROTECTED])"
__version__ = "Revision 0.1"
__copyright__ = "Copyright (C) 2003 den_RDC"
__license__ = "GPL"

#Module Imports
import re
import urllib, urllib2, urlparse
import sys
import codecs
import os

import config 
import util

from mmpython.disc.discinfo import cdrom_disc_id
#Constants

freevo_version = '1.3.4'

imdb_title_list = '/tmp/imdb-movies.list'
imdb_title_list_url = 'ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/movies.list.gz'
imdb_titles = None
imdb_info_tags = ('year', 'genre', 'tagline', 'plot', 'rating', 'runtime')


# headers for urllib2
txdata = None
txheaders = {
    'User-Agent': 'freevo %s (%s)' % (freevo_version, sys.platform),
    'Accept-Language': 'en-us',
}

#Begin class

class FxdImdb:
    """Class for creating fxd files and fetching imdb information"""

    def __init__(self):
        """Initialise class instance"""

        # these are considered as private variables - don't mess with them unless
        # no other choice is given
        # fyi, the other choice always exists : add a subroutine or ask :)

        self.imdb_id_list = []
        self.imdb_id = None
        self.isdiscset = False
        self.title = ''
        self.info = {}

        self.image = None # full path image filename
        self.image_urls = [] # possible image url list
        self.image_url = None # final image url

        self.fxdfile = None # filename, full path, WITHOUT extension

        self.append = False
        self.device = None
        self.regexp = None
        self.mpl_global_opt = None
        self.media_id = None
        self.file_opts = []
  

Re: [Freevo-users] IMDB issues

2006-10-17 Thread Karl Lattimer






Jason Tackaberry wrote:
> Ideally, we would have a small file that exists on freevo.org that
> net-connected freevo boxen can pull periodically to update to new code
> that fixes issues like this that are sporadically temperamental.


Wouldn't it be better to pull a small text config file from the web?

For instance, the rule for the title element would probably be
something like

container = tag, class, id
element = container, tag, class, id

so

container = div, NONE, header
element = container, a, title, NONE

or something to that effect. I'm just dumping FFR.
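
A minimal sketch of how a rule file in that shape might drive the
extraction. Assumptions: the config lines are the ones suggested above,
the page fragment is made up, every function and class name is
hypothetical, and the standard library's html.parser (Python 3) is used
as the parser. It only handles the first matching element.

```python
from html.parser import HTMLParser

CONFIG = """\
container = div, NONE, header
element = container, a, title, NONE
"""

# Made-up markup for illustration; not a real IMDB page.
PAGE = ('<div id="header"><a class="title" href="/x">Example Movie</a></div>'
        '<a class="title">Other</a>')


def load_config(text):
    """Parse 'key = a, b, c' lines into {key: (tag, class, id)} tuples."""
    rules = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        fields = [field.strip() for field in value.split(",")]
        if key == "element":
            fields = fields[1:]  # drop the leading 'container' reference
        rules[key] = tuple(None if f == "NONE" else f for f in fields)
    return rules


def matches(rule, tag, attrs):
    """True if a start tag satisfies a (tag, class, id) rule."""
    want_tag, want_class, want_id = rule
    attrs = dict(attrs)
    classes = (attrs.get("class") or "").split()
    return (tag == want_tag
            and (want_class is None or want_class in classes)
            and (want_id is None or attrs.get("id") == want_id))


class ConfiguredExtractor(HTMLParser):
    """Collects the text of the first configured element inside the container."""

    def __init__(self, rules):
        super().__init__()
        self.container = rules["container"]
        self.element = rules["element"]
        self.depth = 0           # > 0 while inside the container
        self.in_element = False
        self.done = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            if matches(self.container, tag, attrs):
                self.depth = 1
            return
        if not self.done and not self.in_element and matches(self.element, tag, attrs):
            self.in_element = True
        elif tag == self.container[0]:
            self.depth += 1      # nested tag of the container's type

    def handle_endtag(self, tag):
        if self.in_element and tag == self.element[0]:
            self.in_element = False
            self.done = True
        elif self.depth and tag == self.container[0]:
            self.depth -= 1

    def handle_data(self, data):
        if self.in_element:
            self.chunks.append(data)


def extract(page, config_text):
    """Apply the rules in config_text to page and return the matched text."""
    parser = ConfiguredExtractor(load_config(config_text))
    parser.feed(page)
    parser.close()
    return "".join(parser.chunks).strip()
```

The point of the design is that a layout change on the site would only
need new rule lines, not new code; the second link in the sample page is
ignored because it sits outside the configured container.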

K,




Re: [Freevo-users] IMDB issues

2006-10-17 Thread Karl Lattimer




On Tue, 2006-10-17 at 15:32 +0100, John Molohan wrote:
> Jason Tackaberry wrote:
> > On Tue, 2006-10-17 at 14:31 +0100, Karl Lattimer wrote:
> >> Has anyone else been experiencing problems where the IMDB lookup
> >> results in an FXD file being created with a blank title? The result is
> >> that the file goes to the start of the list. There should probably be
> >
> > imdb probably changed their html again.
> >
> > Ideally, we would have a small file that exists on freevo.org that
> > net-connected freevo boxen can pull periodically to update to new code
> > that fixes issues like this that are sporadically temperamental.
>
> Nice idea, could work for web radio/tv and appletrailers too. I might
> stick that in the feature request tracker.


I've actually already done some work porting apple trailers to using SAX
instead of regexp parsing. By making appletrailers use SAX it's a lot
easier to maintain the code. Right now it's not perfect, and if Apple
continues to move things over to itms then it'll be impossible to
aggregate anyway in future, but it's still worth hacking around on for
the moment.

If I can develop appletrailers SAX further and get IMDB working at the
same time that _WOULD_ be a nice patch to get into 1.6

K,






Re: [Freevo-users] IMDB issues

2006-10-17 Thread Duncan Webb
Karl Lattimer wrote:
> I might take a look at it tonight and see if it's something simple and
> dumb. Is it too late to get a patch into 1.6?

Bug fixes can go in, no problem, and I think that this is a bug.

New plug-ins are allowed too, at least until the end of this week; if
they don't work then nobody has lost or gained anything.

Additions that could have a core impact will go into 1.x instead of 1.6.

Duncan

> 
> K,
> 
> On Tue, 2006-10-17 at 10:12 -0400, Jason Tackaberry wrote:
>> On Tue, 2006-10-17 at 14:31 +0100, Karl Lattimer wrote:
>> > Has anyone else been experiencing problems where the IMDB lookup
>> > results in an FXD file being created with a blank title? The result is
>> > that the file goes to the start of the list. There should probably be
>>
>> imdb probably changed their html again.
>>
>> Ideally, we would have a small file that exists on freevo.org that
>> net-connected freevo boxen can pull periodically to update to new code
>> that fixes issues like this that are sporadically temperamental.


Re: [Freevo-users] IMDB issues

2006-10-17 Thread John Molohan
Jason Tackaberry wrote:
> On Tue, 2006-10-17 at 14:31 +0100, Karl Lattimer wrote:
>   
>> Has anyone else been experiencing problems where the IMDB lookup
>> results in an FXD file being created with a blank title? The result is
>> that the file goes to the start of the list. There should probably be
>> 
>
> imdb probably changed their html again.
>
> Ideally, we would have a small file that exists on freevo.org that
> net-connected freevo boxen can pull periodically to update to new code
> that fixes issues like this that are sporadically temperamental.
>   
Nice idea, could work for web radio/tv and appletrailers too. I might 
stick that in the feature request tracker.



Re: [Freevo-users] IMDB issues

2006-10-17 Thread Karl Lattimer




I might take a look at it tonight and see if it's something simple and
dumb. Is it too late to get a patch into 1.6?

K,

On Tue, 2006-10-17 at 10:12 -0400, Jason Tackaberry wrote:
> On Tue, 2006-10-17 at 14:31 +0100, Karl Lattimer wrote:
> > Has anyone else been experiencing problems where the IMDB lookup
> > results in an FXD file being created with a blank title? The result is
> > that the file goes to the start of the list. There should probably be
>
> imdb probably changed their html again.
>
> Ideally, we would have a small file that exists on freevo.org that
> net-connected freevo boxen can pull periodically to update to new code
> that fixes issues like this that are sporadically temperamental.





Re: [Freevo-users] IMDB issues

2006-10-17 Thread Jason Tackaberry
On Tue, 2006-10-17 at 14:31 +0100, Karl Lattimer wrote:
> Has anyone else been experiencing problems where the IMDB lookup
> results in an FXD file being created with a blank title? The result is
> that the file goes to the start of the list. There should probably be

imdb probably changed their html again.

Ideally, we would have a small file that exists on freevo.org that
net-connected freevo boxen can pull periodically to update to new code
that fixes issues like this that are sporadically temperamental.





[Freevo-users] IMDB issues

2006-10-17 Thread Karl Lattimer




Has anyone else been experiencing problems where the IMDB lookup results
in an FXD file being created with a blank title? The result is that the
file goes to the start of the list. There should probably be some kind
of fallback for this based on the filename when the title field comes
back blank, as it does now.
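
One possible shape for such a fallback, as a sketch only. The function
names are hypothetical and nothing here is taken from fxdimdb.py; it
assumes that dots and underscores in the file name stand for spaces, and
it is written for Python 3.

```python
import os
import re


def title_from_filename(path):
    """Derive a readable title from a media file name."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Treat dots and underscores as word separators, then tidy the spacing.
    words = re.sub(r"[._]+", " ", stem)
    return " ".join(words.split())


def choose_title(scraped_title, path):
    """Use the scraped title unless it is missing or blank."""
    if scraped_title and scraped_title.strip():
        return scraped_title.strip()
    return title_from_filename(path)
```

Called with a blank scraped title, choose_title("", "/movies/the_big.sleep.avi")
falls back to the file name instead of leaving the FXD title empty.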

Any suggestions welcome,

K,




Re: [Freevo-users] IMDB issues

2006-10-17 Thread Chris Thomas
I'm experiencing this as well. I manually had to add in the title to
some movies last night.

-Chris

On 10/17/06, Karl Lattimer <[EMAIL PROTECTED]> wrote:
>
>  Has anyone else been experiencing problems where the IMDB lookup results in
> an FXD file being created with a blank title? The result is that the file
> goes to the start of the list. There should probably be some kind of fall
> back solution to this based on the filename, if the title field returns
> blank as it is.
>
>  Any suggestions welcome,
>
>  K,