On 09/10/11 23:33, Age Bosma wrote:
Hi,
Hi,
As far as I understand, Tracker currently only collects meta-data
which can be retrieved from the actual files.
Would it be an idea to extend this concept by supplementing the metadata
which could not be determined from a file with info from external
resources? And/or intelligent guessing for that matter?
E.g. we have a movie with no tags like title, director, etc. We do have
a file name though.
In a lot of cases the movie title can be extracted from it. This could
be added to the tracker metadata, followed by requesting the director of
a movie from an external resource like IMDB.
The same would go for the language of a file like a movie or subtitle,
where a language code is included in a file name.
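To make the file name guessing concrete, here is a minimal sketch in Python. It is not Tracker code: the function name, the tiny LANGUAGE_CODES table and the "a four-digit year ends the title" rule are all assumptions made for illustration, and a title that happens to end in a word like "de" would be misread as a language code.

```python
import re
from pathlib import Path

# Hypothetical, deliberately tiny table; a real one would cover ISO 639.
LANGUAGE_CODES = {"en", "nl", "de", "fr", "es"}


def guess_from_filename(path):
    """Guess a title, release year and language code from a file name."""
    stem = Path(path).stem  # drops only the final extension
    tokens = [t for t in re.split(r"[._\-\s]+", stem) if t]

    # Assumption: a trailing two-letter token is a language code.
    language = None
    if len(tokens) > 1 and tokens[-1].lower() in LANGUAGE_CODES:
        language = tokens.pop().lower()

    # Assumption: the first year-like token after position 0 ends the title.
    year = None
    title_tokens = tokens
    for i, token in enumerate(tokens):
        if i > 0 and re.fullmatch(r"(19|20)\d{2}", token):
            year = int(token)
            title_tokens = tokens[:i]
            break

    return {"title": " ".join(title_tokens), "year": year, "language": language}
```

The guessed title could then be used as the search term for the external lookup.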
A different approach can be taken with music. An audio fingerprint can be
determined, followed by using that to retrieve the additional meta-data
from MusicBrainz.
The external and guessed meta-data should be stored separately from the
normal meta-data stored by Tracker, marked as an external
title/director/... tag.
It is then up to an application to decide what to use. I.e. normal title
present? Use it. No title present but an external title present? Use
that one if you like.
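The choice an application would make can be sketched as follows. This is only an illustration: the metadata is modelled as a plain dict and the "external_" key prefix is a made-up convention, not anything from the Tracker ontology.

```python
def pick(metadata, field, allow_external=True):
    """Return (value, source) for a field, preferring file-derived data.

    `metadata` is a plain dict here; the "external_" prefix is a
    hypothetical naming convention used only in this sketch.
    """
    value = metadata.get(field)
    if value:
        return value, "file"
    if allow_external:
        external = metadata.get("external_" + field)
        if external:
            return external, "external"
    return None, None
```

An application that does not trust guessed data passes allow_external=False and gets only what the file itself provides.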
Why would one want this? Often more info than can be extracted from
files is appreciated. It will prevent applications from having to
reinvent the wheel, deviating from Tracker as their meta-data source
because it does not have the information.
E.g. Rygel could start listing movies on a TV with the actual movie
title instead of using file names or list them by director even though
no tags were present. Banshee (if/when they start using Tracker) does
not have to maintain their own MusicBrainz query service because Tracker
already provides the information.
There are a number of issues here. What springs to mind is:
- Do we write back the data to the file itself (I would like to see
that, but support there is limited right now by file type)?
- Guessing metadata based on filename, etc. is currently a build-time
option. Part of me wonders if this should be in the
tracker-preferences dialog somewhere so users can configure this more
dynamically. Part of me thinks it's not useful though. Perhaps a silent
configuration option that is not in the UI would be preferable.
Does functionality as described above fit within the goals/scope of tracker?
It certainly does.
Would there be any objections against going in this direction?
Not at all.
Does tracker allow extending functionality as described above?
Yes and no. You could write a miner as suggested, but I feel this is not
the right approach. While the name "miner" makes sense, what we're doing
here is more "post-processing" and we've considered having some daemon
to go around cleaning up classes and information which can be derived
from content inserted by miner-fs or applications. A couple of examples
here are:
- You insert a contact for an email, you delete the email, the contact
then stays around. Shouldn't the contact really be removed? It does
depend on who uses it (the graph) but if it is just there for the email,
it should be removed ideally. If some gnome-contacts or other
application makes use of it using their graph to insert the data, we
wouldn't clean it up.
- You want an album's total duration in time inserted and removed when
albums appear or are deleted. Right now we have no way to do this.
- As you say, cleaning up the titles and other information from an
external source like IMDB. I would love to see this, by the way.
--
I guess you could write a miner to do this. It would listen to graph
update signals to know when to find out about new music/videos and
update the store.
You could also write this into tracker-extract/libtracker-extract and
have some common functions to get this information. But you will quickly
run into interesting conditions like: What do we do when you have no
Internet connection? I suppose the Flickr and other miners have had to
deal with this so we have infrastructure there for that.
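One possible way to handle the offline case is to queue the lookups and retry them later. The sketch below is an assumption about how that could look, not how the Flickr or any other existing miner does it; the class name and the use of ConnectionError to signal "no network" are made up for illustration.

```python
from collections import deque


class PendingLookups:
    """Queue of URIs whose external metadata still has to be fetched."""

    def __init__(self):
        self._queue = deque()

    def add(self, uri):
        # Avoid queueing the same file twice.
        if uri not in self._queue:
            self._queue.append(uri)

    def flush(self, lookup):
        """Run `lookup(uri)` for queued URIs, stopping at the first
        ConnectionError so the remainder is kept for the next attempt."""
        results = {}
        while self._queue:
            uri = self._queue[0]
            try:
                results[uri] = lookup(uri)
            except ConnectionError:
                break
            self._queue.popleft()
        return results

    def __len__(self):
        return len(self._queue)
```

flush() would be called again whenever the network comes back; anything that could not be resolved simply stays queued.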
Does the current shared-filemetadata-spec provide a way to store
information as external/additional?
No, not AFAICS. Actually the spec looks quite under-defined and
ill-considered in places. I don't know how up to date it is or if it's even
finished. It certainly doesn't mention how storage of said data should
be handled AFAICS.
--
Regards,
Martyn
Founder and CEO of Lanedo GmbH.