On 1/1/07, Andrew Flegg <[EMAIL PROTECTED]> wrote:
On 12/31/06, Justin Wetherell <[EMAIL PROTECTED]> wrote:
> I've redone my previous work using two SQL statements instead of an
> iteration; I've also implemented a removeDuplicate for use when a
> scheduled recording is removed.

Neat. I think, though, that a single SQL result set which is iterated
over may be more flexible: sometimes subtitles and descriptions are
close, but not exactly identical.

Attached is my variant of your first patch. It still needs to include the
removeDuplicate() function, but I think it's the best approach:

  * Null values don't cause false positives, and are stored in the
    database as the value NULL rather than the string "null" (a bug in
    dbutil.escape())

  * Duplication is detected on the lower case, space-stripped version of
    subtitle & description (Levenshtein distance didn't prove too useful)
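To illustrate the comparison described in the two points above, here is a
minimal sketch. It is not the code from the attached patch: the function
names normalise() and is_duplicate() are made up for this example, and it
assumes that "space-stripped" means removing all whitespace and that both
subtitle and description must match.

```python
def normalise(text):
    """Lower-case the text and drop all whitespace; None stays None."""
    if text is None:
        return None
    return "".join(text.lower().split())


def is_duplicate(a, b):
    """Compare two (subtitle, description) pairs on their normalised form.

    Assumption for this sketch: a field that is None on either side is
    treated as unknown and never counts as a match, so missing data
    cannot produce a false positive.
    """
    for x, y in zip(a, b):
        nx, ny = normalise(x), normalise(y)
        if nx is None or ny is None or nx != ny:
            return False
    return True
```

With this, ("The Pilot ", "A new start.") and ("the pilot", "A  New Start.")
compare as duplicates, while two programmes with no subtitle or description
at all do not.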

Comments appreciated. In particular, given your use of dbutil.escape(),
I think we need to modify it so that it returns either "'sql'" (the value
wrapped in single quotes) or "null"; otherwise the string "null" gets
stored in the DB. Nothing in the core seems to use the function, so
modifying it to wrap non-null values in single quotes, and changing its
use in recordserver to match, would *seem* to be OK?
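Roughly what I have in mind for the changed behaviour is sketched below.
This is an illustration only, not the current dbutil.escape() source: the
doubling of embedded single quotes is an assumption about how the escaping
would be done, and the table and column names in the usage line are
hypothetical.

```python
def escape(value):
    """Return a SQL literal for value (sketch of the proposed behaviour).

    None becomes the bare keyword NULL; anything else is returned already
    wrapped in single quotes, so callers must not add their own quotes.
    """
    if value is None:
        return "NULL"
    # Assumption: embedded single quotes are escaped by doubling them.
    return "'" + str(value).replace("'", "''") + "'"


# Hypothetical caller: note there are no quotes around the placeholders.
sql = "INSERT INTO programs (sub_title, description) VALUES (%s, %s)" % (
    escape(None), escape("It's a repeat"))
```

The point of the design is that the quoting decision lives in one place, so
a missing subtitle ends up as a real NULL rather than the text "null".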

Cheers,

Andrew

--
Andrew Flegg -- mailto:[EMAIL PROTECTED]  |  http://www.bleb.org/

Attachment: dedupe-ajf.patch
Description: Binary data

_______________________________________________
Freevo-devel mailing list
Freevo-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freevo-devel