Hi all,

Today I committed duplicate detection to F-Spot. It has been a long-standing issue [1] that can hopefully be resolved before the next release.

How does it work?
------------------

Basically, F-Spot detects duplicates by comparing md5 sums of the image data. When a new image is imported, F-Spot checks whether an image with the same md5 sum already exists in the photo database. If that is the case (and if you requested not to import duplicates), F-Spot skips the image and imports only the ones that are not yet in your image library.

Because the md5 sum of each image is stored in the database, duplicate detection is quite fast and is not affected much by the size of your database or the number of photos you're importing.

Updating
---------

Because the md5 value is stored in the database, existing F-Spot databases need an update. We kept in mind that some people have very large databases, and we didn't want to force them to wait a long time for the db to be completely upgraded. Therefore, the first time you launch F-Spot with the duplicates patch, it changes the db schema and creates md5 sum calculation jobs. These gradually calculate md5 sums for all images already in F-Spot, in the background and without disturbing the user. As a result, it may take a while before duplicate detection is fully operational.

Remarks
--------

Running the latest version of F-Spot from svn will thus change your database schema. We tested it on large databases [2], where the update happened without too much hassle, but be aware that it might take some time. If you want to test it before you run it against your full db, you can always use the -b switch, of course.

Best Regards,
Thomas

[1] http://bugzilla.gnome.org/show_bug.cgi?id=169646
[2] http://bugzilla.gnome.org/show_bug.cgi?id=169646#c70
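
As a rough illustration of the scheme described above, here is a minimal Python sketch. It is not the F-Spot code: the table layout, the column name md5_sum, and the helpers import_photo and backfill_md5 are all hypothetical, and it hashes raw bytes where F-Spot hashes the image data. It only shows the two ideas from the message: look up the md5 sum before importing, and fill in missing sums for existing photos in small batches.

```python
import hashlib
import sqlite3


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


def open_library() -> sqlite3.Connection:
    """Create an in-memory stand-in for the photo database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE photos (id INTEGER PRIMARY KEY, name TEXT, md5_sum TEXT)"
    )
    # An index on the sum keeps the duplicate lookup cheap as the library grows.
    db.execute("CREATE INDEX idx_photos_md5 ON photos (md5_sum)")
    return db


def import_photo(db, name, data, skip_duplicates=True) -> bool:
    """Import one photo; return False if it was skipped as a duplicate."""
    digest = md5_of(data)
    if skip_duplicates:
        row = db.execute(
            "SELECT 1 FROM photos WHERE md5_sum = ? LIMIT 1", (digest,)
        ).fetchone()
        if row is not None:
            return False
    db.execute("INSERT INTO photos (name, md5_sum) VALUES (?, ?)", (name, digest))
    return True


def backfill_md5(db, load_data, batch_size=100) -> int:
    """Fill in missing sums for up to batch_size photos; return how many were updated.

    load_data is an assumed callback that returns the bytes for a photo name.
    """
    rows = db.execute(
        "SELECT id, name FROM photos WHERE md5_sum IS NULL LIMIT ?", (batch_size,)
    ).fetchall()
    for photo_id, name in rows:
        db.execute(
            "UPDATE photos SET md5_sum = ? WHERE id = ?",
            (md5_of(load_data(name)), photo_id),
        )
    return len(rows)
```

Calling backfill_md5 repeatedly until it returns 0 mimics the gradual background jobs: photos that have not been processed yet simply cannot be matched as duplicates, which is why detection only becomes complete once the backlog is empty.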
