Hi Andrew,

On Mon, 14-07-2008 at 15:37 -0700, ext Andrew Leung wrote:
> I would like to utilize Tracker's file content extraction mechanism  
> within my own program. Basically I would like to be able to parse  
> various file types and pull out keywords. Does Tracker have any  
> mechanism (API/separate program) that I can use to pull content
> from various file types?

 The content is extracted by the scripts
in /usr/local/lib/tracker/filters/. These scripts are organized by
MIME type name, and they usually call external programs (such as wv or
pdftotext) to extract the contents.

 Tracker obtains the MIME type of the file, decides its category and, if
the category "Has full text", calls the corresponding script.
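
 If you want to reuse the filters from your own program, the flow can be
sketched roughly like this in Python. This is only an illustration, not
Tracker's code: the <type>/<subtype>_filter layout, the calling convention
(input file as the only argument, text on stdout) and the helper names
find_filter and extract_text are assumptions, so check the scripts installed
on your system first. Python's mimetypes module stands in for Tracker's own
MIME detection, and the "Has full text" check is reduced to "a filter script
exists for this type".

```python
import mimetypes
import subprocess
from pathlib import Path

# Directory named in this mail; the layout below it is an assumption.
FILTERS_DIR = Path("/usr/local/lib/tracker/filters")


def find_filter(mime_type, filters_dir=FILTERS_DIR):
    """Return the filter script for mime_type, or None if there is none.

    Assumes scripts live at <filters_dir>/<type>/<subtype>_filter.
    """
    major, _, minor = mime_type.partition("/")
    if not major or not minor:
        return None
    script = Path(filters_dir) / major / f"{minor}_filter"
    return script if script.is_file() else None


def extract_text(path, filters_dir=FILTERS_DIR):
    """Guess the MIME type of path and run the matching filter script.

    Assumes the script takes the input file as its only argument and
    writes plain text to stdout. Returns None when no filter applies.
    """
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None:
        return None
    script = find_filter(mime_type, filters_dir)
    if script is None:
        return None
    result = subprocess.run(
        [str(script), str(path)], capture_output=True, text=True, check=True
    )
    return result.stdout
```

 Calling extract_text("report.pdf") would then return whatever the PDF
filter prints, or None if no filter is installed for that type.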


> Beagle search has a program called 'beagle-extract-content' that I  
> have been using for this purpose though I haven't been particularly  
> happy with it. Thanks a lot.

 We also have a "tracker-extractor" program. It extracts the _metadata_ of
a file (not its contents). Maybe it is useful for you too.

 Any improvements to the filters/extractors are welcome ;)

 Regards,

Ivan


_______________________________________________
tracker-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/tracker-list
