Hi,

On Fri, 2006-11-17 at 14:42 +0100, Michal Pryc wrote:
> So if anyone can tell me how is with the supported data types?

It's important to separate data sources from file types. A data source can (in theory) produce items of any file type. For example, the file system data source produces text files, image files, PDFs, etc. In your list, you are combining the two. Email data sources produce not only RFC 822 emails, but also attachments of any file type contained inside those emails. Some data sources produce only one file type: the Gaim log backend, for example, produces only Gaim logs.

For all of the data sources, there is custom code to extract the data. Only the Evolution Data Server backend uses an external library: evolution-sharp, which wraps the evolution-data-server C APIs in C#.

We extract metadata from all of the supported file types, and extract full text from all supported file types that have it. In almost all cases these are handled at the same time, by the same code.

Many of the file types have custom parsing code. I'm only going to list the ones for which we use special libraries or external programs:

* Emails - gmime-sharp
* MS Word - wv1, optionally gsf-sharp
* MS Excel - An external program from gnumeric called ssindex
* MS Powerpoint - gsf-sharp
* PDF - We run the external pdfinfo and pdftotext programs from xpdf and parse the output
* HTML - A modified HtmlAgilityPack included in the Beagle source tree
* Windows help files (chm) - chmlib
* Image files - custom code, mostly copied from F-Spot
* Audio files - entagged-sharp, included in the Beagle source tree. Plans are to move to taglib-sharp, which is what Banshee uses.
* Video files - External programs from either MPlayer or Totem
* RPM - The rpm program itself

The full list of filters in the source tree is here:

http://cvs.gnome.org/viewcvs/beagle/Filters/

You can get more information on the specific .NET namespaces (e.g., System.Xml) from there.
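One way to collect those namespaces from a local checkout of the filter sources can be sketched roughly as follows. This is Python for illustration only and is not part of Beagle; the function names, the regular expression, and the assumption that the filters live in *.cs files under a directory you pass on the command line are all mine.

```python
import re
import sys
from collections import Counter
from pathlib import Path
from typing import List

# Hypothetical pattern: matches C# using directives such as
# "using System.Xml;" and aliased ones such as "using G = Foo.Bar;".
# It does not match using statements like "using (var s = ...)".
USING_RE = re.compile(
    r"^\s*using\s+(?:\w+\s*=\s*)?([A-Za-z_][\w.]*)\s*;",
    re.MULTILINE,
)


def namespaces_in_source(source: str) -> List[str]:
    """Return the namespaces named by using directives, in file order."""
    return USING_RE.findall(source)


def count_namespaces(root: Path) -> Counter:
    """Count, per namespace, how many *.cs files under root import it."""
    counts: Counter = Counter()
    for path in sorted(root.rglob("*.cs")):
        text = path.read_text(encoding="utf-8", errors="replace")
        # Use a set so a namespace counts at most once per file.
        counts.update(set(namespaces_in_source(text)))
    return counts


if __name__ == "__main__":
    # Assumed usage: pass the path of a local Filters checkout.
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for namespace, n_files in count_namespaces(root).most_common():
        print(f"{n_files:4d}  {namespace}")
```

Run against a checkout, this would print each namespace with the number of filter files that import it, which gives a quick picture of which libraries a given filter depends on.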
(Hint: grep for "using")

Joe
_______________________________________________
Dashboard-hackers mailing list
Dashboard-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/dashboard-hackers