On 12/31/06, Justin Wetherell <[EMAIL PROTECTED]> wrote:
> I've redone my previous work using two SQL statements instead of an
> iteration; I've also implemented a removeDuplicate for use when a
> scheduled recording is removed.


Neat. I think, though, that a single SQL result set which we iterate
over may be more flexible: sometimes subtitles and descriptions are
close, but not quite identical.

For example, upcoming on BBC 3 are episodes of Torchwood with
subtitles of "12&13/13" and "12 & 13/13" - note the extra space.

MythTV seems to handle this better, and looping over the result set
would mean we could do some smarter checking.
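
As a rough sketch of what I mean by looping over the result set: the
table name "recorded", its columns, and the find_duplicates() helper
below are all made up for illustration and are not Freevo's actual
schema or code. The point is only that one query fetches the
candidates and the comparison itself lives in Python, where it can be
swapped out.

```python
import sqlite3


def find_duplicates(conn, title, subtitle, same):
    """Return rows for `title` whose subtitle matches according to `same`.

    `same` is any two-argument callable returning True or False, so the
    matching rule can change without touching the SQL.
    """
    rows = conn.execute(
        "SELECT title, subtitle FROM recorded WHERE title = ?", (title,)
    )
    return [row for row in rows if same(row[1], subtitle)]


# Hypothetical in-memory data, using the Torchwood subtitles from above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recorded (title TEXT, subtitle TEXT)")
conn.executemany(
    "INSERT INTO recorded VALUES (?, ?)",
    [("Torchwood", "12&13/13"), ("Torchwood", "1/13")],
)


def ignore_spaces(a, b):
    """Treat two strings as equal if they differ only in spaces."""
    return a.replace(" ", "") == b.replace(" ", "")


matches = find_duplicates(conn, "Torchwood", "12 & 13/13", ignore_spaces)
```

With a plain equality test in place of ignore_spaces, the same call
finds nothing, which is the case a pure SQL "=" comparison would hit.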

Having said that, Myth's scheduler[1] doesn't seem to do anything more
clever (and uses a single SQL statement), so it must be resolving the
differences elsewhere. Smarter matching could be a good win here, for
example:

   * Compare subtitle and description with spaces removed.
   * Use string similarity comparison, perhaps only when length > SOME_NUM[2].
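
A minimal sketch of those two checks follows. It uses difflib from the
standard library as a stand-in for the Levenshtein implementations in
[2], so the score is difflib's ratio rather than an edit distance.
The names and both constants (MIN_FUZZY_LENGTH standing in for
SOME_NUM, and SIMILARITY_THRESHOLD) are guesses for illustration and
would need tuning against real guide data.

```python
from difflib import SequenceMatcher

# Both values are assumptions, not tested against real listings.
MIN_FUZZY_LENGTH = 20        # plays the role of SOME_NUM
SIMILARITY_THRESHOLD = 0.9   # minimum difflib ratio to count as a match


def strip_spaces(text):
    """Remove all whitespace so '12&13/13' and '12 & 13/13' compare equal."""
    return "".join(text.split())


def probably_same(a, b):
    """Guess whether two subtitles or descriptions refer to the same episode."""
    a, b = strip_spaces(a), strip_spaces(b)
    if a == b:
        return True
    # Short strings such as '1/13' and '2/13' are nearly identical by any
    # similarity measure, so only compare fuzzily above a minimum length.
    if min(len(a), len(b)) <= MIN_FUZZY_LENGTH:
        return False
    return SequenceMatcher(None, a, b).ratio() >= SIMILARITY_THRESHOLD
```

The length guard is the reason for the "only when length > SOME_NUM"
idea: without it, episode numbers that differ by one character would
be treated as duplicates.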

Cheers,

Andrew

[1] http://svn.mythtv.org/trac/browser/trunk/mythtv/programs/mythbackend/scheduler.cpp
[2] http://www.personal.psu.edu/iua1/python/apse/
 or http://trific.ath.cx/resources/python/levenshtein/
    (implementation in pure Python in editdist() in http://tinyurl.com/yjxqnv)

-- 
Andrew Flegg -- mailto:[EMAIL PROTECTED]  |  http://www.bleb.org/
