The following module was proposed for inclusion in the Module List:
modid: Digest::ManberHash
DSLIP: bdcOg
description: Estimating similariness in files
userid: PMAREK (Philipp Marek)
chapterid: 17 (Archiving_and_Compression)
communities:
similar:
String::Similarity String::Approx
rationale:
This module computes a set of hash values for any given file; these
hash values can be used to compare files and to estimate how similar
they are.
Because it yields several values per file rather than a single one,
it cannot be replaced by MD5, SHA-1, or other cryptographic hashes.
The difference between String::Similarity, String::Approx and this
module is that this module can be used to compare BIG files.
String::Similarity and String::Approx are (AFAIU) approximately
O(N*M), whereas Digest::ManberHash is only O(N+M) (with N and M the
sizes of the compared objects); however, Digest::ManberHash works
only for larger data sets.
For details please see http://manber.com/publications.html or
ftp://ftp.cs.arizona.edu/reports/1993/TR93-33.ps
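To illustrate the general idea behind the rationale above, here is a
minimal Python sketch of sampled-fingerprint comparison. It is an
assumption-laden illustration, not the Digest::ManberHash interface or
implementation: the names fingerprints and similarity, the window
size, the sampling mask and the hash parameters are all hypothetical.
It makes one linear pass per input with a rolling hash, keeps only the
hashes whose low bits are zero, and compares the two resulting sets.

```python
# Illustrative sketch only; NOT the Digest::ManberHash API.
# All names and parameters below are hypothetical choices.

def fingerprints(data: bytes, window: int = 16, mask_bits: int = 4) -> set:
    """Return a sampled set of rolling hashes over all `window`-byte substrings."""
    base, mod = 257, (1 << 61) - 1
    if len(data) < window:
        return set()  # too little data to produce any fingerprint
    mask = (1 << mask_bits) - 1
    top = pow(base, window - 1, mod)  # weight of the byte leaving the window

    # Hash of the first window.
    h = 0
    for byte in data[:window]:
        h = (h * base + byte) % mod

    picked = set()
    if h & mask == 0:
        picked.add(h)

    # Slide the window one byte at a time: O(len(data)) overall.
    for i in range(window, len(data)):
        h = ((h - data[i - window] * top) * base + data[i]) % mod
        if h & mask == 0:  # keep roughly 1 in 2**mask_bits hashes
            picked.add(h)
    return picked


def similarity(a: bytes, b: bytes) -> float:
    """Jaccard overlap of the two fingerprint sets, between 0.0 and 1.0."""
    fa, fb = fingerprints(a), fingerprints(b)
    if not fa and not fb:
        return 0.0  # nothing sampled; no basis for a comparison
    return len(fa & fb) / len(fa | fb)
```

In this sketch an input shorter than the window yields no
fingerprints at all, and a short input yields only a handful, which
is one plausible reading of why such a scheme suits larger data sets.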
enteredby: PMAREK (Philipp Marek)
enteredon: Tue Aug 19 12:37:51 2003 GMT
The resulting entry would be:
Digest::
::ManberHash bdcOg Estimating similariness in files PMAREK
Thanks for registering,
--
The PAUSE
PS: The following links are only valid for module list maintainers:
Registration form with editing capabilities:
https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=a0400000_991065f3581374b9&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=a0400000_991065f3581374b9&SUBMIT_pause99_add_mod_insertit=1