Hi!

On Tue, 2010-02-09 at 21:31 +0100, Adrien Bustany wrote:
> Hi all,
>
> this proposal is about automatically reindexing a mime type when a new
> extractor is added/updated. There's already a "Reindex" call in
> tracker-miner-fs.

I think we should take several use cases into account here:

* A new tracker-extractor module is installed.
* As mbiebl pointed out, a tracker-extractor module gains some new
  capability.
* The library a tracker extractor relies on gains new capabilities
  (e.g. GStreamer, poppler).

IMHO the trickiest one is the 3rd, which requires either integration
from packagers or some way for extractors to probe the supported file
types. The latter would largely depend on whether the library is able
to tell us that, which I don't think happens often, so we might just be
forced to go with the former.

For the second use case, we clearly need some way to version the
extractors, so that it is known when to re-extract. The keyfile with
version info approach looks quite sane to me; we should provide some
command line tool to bump the version number for a given extractor.

>
> Philip told me he'd like to keep tracker-extract as stupid as possible,
> so the logic here would be implemented in tracker-miner-fs, at init time.
> All extractors modules provide a function to know the mime types they can
> index, but we want to avoid loading all the modules at start. Therefore,
> a solution using desktop files is favoured. The desktop files would be
> installed in ${datadir}/tracker/extractors and would have the format

This makes much sense to me, since the extractor might not run at all if
tracker-miner-fs thinks everything is up to date. Certainly, having
complete info in the miner about the mimetypes the extractors know about
would avoid sending queries about unsupported files to the extractor, as
happens right now.

>
> [Extractor]
> Name=Foobar extractor
> MimeTypes=application/foobar

What I propose is keeping the mimetype info separate from the versioning
info. I think this way we can provide reasonable support for the 3 use
cases mentioned above:

* If tracker-extractor-foo 0.15 provides better information than the
  previous release 0.14, the version file installed by the package bumps
  its number; tracker-miner-fs notices the version bump and does its job.
* If some gstreamer package is added, package scripts use some CLI
  tracker tool, if available, to add new mimetypes for the gstreamer
  extract module; tracker-miner-fs notices the change in supported
  mimetypes and does its job. The version number isn't actually changed,
  since the extract module hasn't changed.
* A new extractor is installed; tracker-miner-fs notices it has no prior
  info about it and does its job, more or less like a version bump.

So, I guess there should be some $(datadir)/tracker/extractors/ with
version info and a $(datadir)/tracker/extractor-mimetypes/ with a
mimetype->extractor mapping. The main caveat I see here is how the
initial mimetype mapping would be done for certain modules (gstreamer
yet again in mind :); this could once again require help from packagers.

Besides, we also need to take into account restarting tracker-extract if
it's alive at the time of the update, and making things persistent so
that pending file checks aren't lost if tracker-miner-fs shuts down.
This is going to be tricky :)

Cheers,
  Carlos

>
> at startup, the FS miner loads all the description files and checks if a
> new extractor has been added, removed, or changed its mtime. If
> so, it calls the reindex method with the appropriate mime type.
> To detect a change in a desktop file, a list of each desktop file with
> their modification time is kept in cache by the FS miner.
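
To make the startup check a bit more concrete, here is a rough Python
sketch of how the miner could decide which mimetypes to reindex for the
three cases above. It is only an illustration, not tracker code: the
function names are made up, the Version key is an assumption (the quoted
format only has Name and MimeTypes), MimeTypes is assumed to be
semicolon-separated as in desktop entries, and the cache is a plain dict
rather than anything persistent. Removed extractors are not handled.

```python
import configparser


def parse_description(text):
    """Parse one extractor description in the quoted [Extractor] format.

    Assumption: an optional Version key, which the quoted format lacks.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. "MimeTypes"
    parser.read_string(text)
    section = parser["Extractor"]
    raw = section.get("MimeTypes", fallback="")
    mimetypes = {m.strip() for m in raw.split(";") if m.strip()}
    return {"version": section.get("Version", fallback="0"),
            "mimetypes": mimetypes}


def mimetypes_to_reindex(cached, installed):
    """Return the set of mimetypes that need reindexing.

    Both arguments map an extractor name to a dict with the keys
    "version" (str) and "mimetypes" (set of str).
    """
    todo = set()
    for name, info in installed.items():
        old = cached.get(name)
        if old is None or old["version"] != info["version"]:
            # New extractor or version bump: redo everything it handles.
            todo |= info["mimetypes"]
        else:
            # Same module, so only the newly supported mimetypes matter.
            todo |= info["mimetypes"] - old["mimetypes"]
    return todo
```

With something like this, a gstreamer package adding a mimetype would
only trigger reindexing for that mimetype, while a version bump would
trigger it for everything the extractor handles.
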
>
> Ideas :
> - Describe several extractors in one file
>   That makes it much more difficult to detect a change, since only one
>   extractor in a file listing 10 might have changed when the file
>   modification time changes.
> - Adding a version number in the desktop file, to avoid relying only on
>   the mtime of the desktop file.
>
> Please tell me your thoughts
>
> Adrien
> _______________________________________________
> tracker-list mailing list
> tracker-list@gnome.org
> http://mail.gnome.org/mailman/listinfo/tracker-list