Re: [Tracker] Indexing mbox files
On 2010-08-23 10:08, Philip Van Hoof wrote:
> On Wed, 2010-08-18 at 18:36 +0200, Mildred Ki'Lya wrote:
>> Additionally, how do I generate the URI for the different objects?
>
> As described here:
> http://live.gnome.org/Tracker/Documentation/Examples/SPARQL/Email#Storage
>
> "To get a good scheme for forming URLs for IMAP, read RFC 5092. The
> FETCH command in IMAP explains how to use sections at page 55. For a
> URL scheme for POP read RFC 2384. Avoid inventing your own URL scheme.
> You want to use these URLs for the value of nie:url in RDF. You can
> also use them for the subjects of your resources, which is what I will
> do in the examples that follow."

Hi,

I'm not going to use the IMAP URI, I just can't. The messages I'm
indexing are on the local hard drive, in mbox files. I don't have an IMAP
server to attach those messages to. They are archives and are probably
not on any IMAP or POP server anywhere. And some of those messages might
never have been on any IMAP or POP server.

Mildred

--
Mildred Ki'Lya
╭─ mildred593@online.fr ──
│ Jabber, GoogleTalk: mild...@jabber.fr
│ Website: http://ki.lya.online.fr GPG ID: 9A7D 2E2B
│ Fingerprint: 197C A7E6 645B 4299 6D37 684B 6F9D A8D6 9A7D 2E2B

___
tracker-list mailing list
tracker-list@gnome.org
http://mail.gnome.org/mailman/listinfo/tracker-list
Re: [Tracker] Indexing mbox files
Hi,

Would you be interested in an extractor, written in Vala, that extracts
information from mailboxes? If you are, I'll have to ask permission from
my employer, but I'm sure it could be done. (For the moment, it handles
neither MIME multipart nor threading, since that wasn't useful for my
application.)

I prefer file extractors to application plugins that update the Tracker
database. A file extractor works for any file type you support, while
plugins have to be rewritten for every application. For example, I don't
use Evolution, and my mail client changed from Sylpheed to Claws Mail and
now Thunderbird, and perhaps soon Balsa. Writing a plugin for each is a
waste of time, especially considering they all use standardised formats.

I recently realized that Tracker is a wonderful thing. You could write an
e-mail reader that just uses Tracker to find all the e-mails you have on
your machine. In the same way, we could have a contacts application that
lets you see all contacts everywhere on your machine: vCards in your
~/Documents, or sent to you as e-mail attachments... There is no
limitation. Truly interoperable.

I'd love such a desktop. Thank you guys for making this possible!

Mildred

--
Mildred Ki'Lya
Re: [Tracker] Indexing mbox files
On Tue, 2010-08-24 at 14:04 +0200, Mildred Ki'Lya wrote:
> Hi,
>
> Would you be interested in an extractor, written in Vala, that extracts
> information from mailboxes?

Yes

> If you are, I'll have to ask permission from my employer, but I'm sure
> it could be done.

Great. For extractors to go upstream we require them to be licensed as
GPL. You don't need to reassign copyright ownership to us.

> (For the moment, it handles neither MIME multipart nor threading, since
> that wasn't useful for my application.)

OK, MIME multipart would be very useful, though.

> I prefer file extractors to application plugins that update the Tracker
> database. A file extractor works for any file type you support, while
> plugins have to be rewritten for every application.

OK

> For example, I don't use Evolution, and my mail client changed from
> Sylpheed to Claws Mail and now Thunderbird, and perhaps soon Balsa.
> Writing a plugin for each is a waste of time, especially considering
> they all use standardised formats.

Yes, sure.

> I recently realized that Tracker is a wonderful thing. You could write
> an e-mail reader that just uses Tracker to find all the e-mails you
> have on your machine.

Indeed

> In the same way, we could have a contacts application that lets you see
> all contacts everywhere on your machine: vCards in your ~/Documents, or
> sent to you as e-mail attachments... There is no limitation.

:-)

> Truly interoperable.

Awesome that we have a new app developer who undoubtedly is going to make
a lot of software like what you just described.

> I'd love such a desktop. Thank you guys for making this possible!

No problem. Thank you for your enthusiasm. You just made our day.

Cheers,
Philip

--
Philip Van Hoof
freelance software developer
Codeminded BVBA - http://codeminded.be
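As an illustration of the multipart handling Philip asks for above, here is a minimal Python sketch. It is not the Vala extractor discussed in this thread. It uses the standard `email` module, and `text_parts` and `SAMPLE` are hypothetical names chosen for the example. It walks every MIME part of a message and collects the decoded text/plain bodies.

```python
import email
from email.message import Message

# Hypothetical two-part message used only to exercise the function below.
SAMPLE = (
    "From: a@example.org\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "plain body\n"
    "--b1\n"
    "Content-Type: text/html; charset=utf-8\n"
    "\n"
    "<p>html body</p>\n"
    "--b1--\n"
)


def text_parts(msg: Message) -> list:
    """Return the decoded text/plain bodies of a message, multipart or not."""
    bodies = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container parts carry no payload of their own
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            bodies.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset label in the header
            bodies.append(payload.decode("utf-8", errors="replace"))
    return bodies


if __name__ == "__main__":
    for body in text_parts(email.message_from_string(SAMPLE)):
        print(body.strip())
```

A non-multipart message goes through the same loop, because `walk()` yields the message itself as its only part.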
Re: [Tracker] Indexing mbox files
On Wed, 2010-08-18 at 18:36 +0200, Mildred Ki'Lya wrote:
> On 2010-08-18 18:16, Mildred Ki'Lya wrote:
>> So, if I understood everything well, I should have something like:
>>
>> nmo:Mailbox - nie:hasPart - nmo:MailboxDataObject - nie:interpretedAs - nmo:Email
>>
>> and of course
>>
>> nmo:Email - nie:isStoredAs - nmo:MailboxDataObject - nie:isPartOf - nmo:Mailbox
>
> Additionally, how do I generate the URI for the different objects?

As described here:
http://live.gnome.org/Tracker/Documentation/Examples/SPARQL/Email#Storage

"To get a good scheme for forming URLs for IMAP, read RFC 5092. The FETCH
command in IMAP explains how to use sections at page 55. For a URL scheme
for POP read RFC 2384. Avoid inventing your own URL scheme. You want to
use these URLs for the value of nie:url in RDF. You can also use them for
the subjects of your resources, which is what I will do in the examples
that follow."

> For nmo:Mailbox, this is just the filename (URI provided by Tracker).
> But for nmo:MailboxDataObject and nmo:Email, what is the most
> appropriate URI?
>
> I thought of using file-uri#n for MailboxDataObject and
> urn:email:message-id for Email. Do you think that is appropriate?
>
> Mildred
>
> PS: I'm writing the plugin in Vala, which I prefer to plain C :)

Sure

Cheers,
Philip

--
Philip Van Hoof
Re: [Tracker] Indexing mbox files
On 2010-08-18 18:16, Mildred Ki'Lya wrote:
> So, if I understood everything well, I should have something like:
>
> nmo:Mailbox - nie:hasPart - nmo:MailboxDataObject - nie:interpretedAs - nmo:Email
>
> and of course
>
> nmo:Email - nie:isStoredAs - nmo:MailboxDataObject - nie:isPartOf - nmo:Mailbox

Additionally, how do I generate the URI for the different objects?

For nmo:Mailbox, this is just the filename (URI provided by Tracker). But
for nmo:MailboxDataObject and nmo:Email, what is the most appropriate
URI?

I thought of using file-uri#n for MailboxDataObject and
urn:email:message-id for Email. Do you think that is appropriate?

Mildred

PS: I'm writing the plugin in Vala, which I prefer to plain C :)

--
Mildred Ki'Lya
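The two URI shapes proposed above (file-uri#n and urn:email:message-id) can be sketched in Python as follows. This only illustrates the proposal and is not something the thread settles on. The function names are hypothetical, and `urn:email` is the ad hoc prefix from the message, not a registered URN namespace. A message without a Message-ID header would need some other fallback, which this sketch does not cover.

```python
from pathlib import Path
from urllib.parse import quote


def data_object_uri(mbox_path: str, index: int) -> str:
    """file-uri#n: the n-th message stored inside the given mbox file."""
    return f"{Path(mbox_path).absolute().as_uri()}#{index}"


def email_uri(message_id: str) -> str:
    """urn:email:message-id, with the header's angle brackets removed."""
    bare = message_id.strip().strip("<>")
    # Percent-encode anything unusual so the result stays a valid URI.
    return "urn:email:" + quote(bare, safe="@.")


if __name__ == "__main__":
    print(data_object_uri("/tmp/archive.mbox", 3))
    print(email_uri("<abc.123@example.org>"))
```

The fragment-based URI is only stable while messages keep their position in the file. The Message-ID based one survives reordering but depends on the header being present and unique.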
Re: [Tracker] Indexing mbox files
On Wed, 18 Aug 2010 18:36:40 +0200, Mildred Ki'Lya mildred...@gmail.com wrote:
> On 2010-08-18 18:16, Mildred Ki'Lya wrote:
>> So, if I understood everything well, I should have something like:
>>
>> nmo:Mailbox - nie:hasPart - nmo:MailboxDataObject - nie:interpretedAs - nmo:Email
>>
>> and of course
>>
>> nmo:Email - nie:isStoredAs - nmo:MailboxDataObject - nie:isPartOf - nmo:Mailbox
>
> Additionally, how do I generate the URI for the different objects?

You can use update_blank with an anonymous node; Tracker will generate a
URN for you.

Cheers
Adrien

> For nmo:Mailbox, this is just the filename (URI provided by Tracker).
> But for nmo:MailboxDataObject and nmo:Email, what is the most
> appropriate URI?
>
> I thought of using file-uri#n for MailboxDataObject and
> urn:email:message-id for Email. Do you think that is appropriate?
>
> Mildred
>
> PS: I'm writing the plugin in Vala, which I prefer to plain C :)
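Adrien's suggestion relies on an anonymous (blank) node as the subject of the insert, so that the store assigns the URN itself. The Python sketch below only builds such an update string; it does not call Tracker. The helper names are hypothetical, and the choice of nmo:messageSubject and nmo:messageId as the properties to set is an assumption made for illustration. How the string is handed to update_blank depends on the Tracker binding in use.

```python
def escape_literal(value: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def insert_email_sparql(subject: str, message_id: str) -> str:
    """Build an INSERT whose subject is the blank node _:email."""
    return (
        "INSERT { _:email a nmo:Email ; "
        f'nmo:messageSubject "{escape_literal(subject)}" ; '
        f'nmo:messageId "{escape_literal(message_id)}" }}'
    )


if __name__ == "__main__":
    print(insert_email_sparql("Indexing mbox files", "<abc.123@example.org>"))
```

Because the subject is a blank node, running the same insert twice would create two resources. An extractor that re-indexes files would need either a fixed URI or a lookup by message ID first.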
[Tracker] Indexing mbox files
Hi,

I am in the process of writing an extractor for mbox files. My purpose is
to use Tracker as an internal search engine for another application, and
this application stores conversations in mbox files with specific headers
I need to parse. Additionally, it could be used to search within e-mails
when you don't use Evolution.

I have the following problems, and I thought you guys could help me.

First (but this isn't important at all), libmagic reports mbox files as
text/plain instead of application/mbox. So I have to register my
extractor for both MIME types and filter afterwards.

Then, I started looking at ontologies, and I had a hard time figuring
everything out. I want to register both the mailbox and the messages
within it. I started out reading the NMO (Nepomuk Message Ontology), only
to find that there were no relations between an nmo:Mailbox and an
nmo:Email. I had to look at the NIE (Nepomuk Information Element) to
construct the relations between them.

So, if I understood everything well, I should have something like:

nmo:Mailbox - nie:hasPart - nmo:MailboxDataObject - nie:interpretedAs - nmo:Email

and of course

nmo:Email - nie:isStoredAs - nmo:MailboxDataObject - nie:isPartOf - nmo:Mailbox

It seems rather complex to me, and I wanted to know if there is a simpler
way of doing things.

Thanks,

Mildred

--
Mildred Ki'Lya
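The "filter afterwards" step for files that libmagic reports as text/plain could look like the Python sketch below. It assumes the usual mbox convention that the file opens with a "From " separator line (sender, then a date). The function names and the exact pattern are hypothetical, and a strict parser would validate the date as well.

```python
import re

# An mbox separator line looks like: "From sender@host Wed Aug 18 18:16:00 2010"
FROM_LINE = re.compile(rb"^From \S+ +\S.*$")


def is_from_line(line: bytes) -> bool:
    """True if the line has the shape of an mbox 'From ' separator."""
    return FROM_LINE.match(line.rstrip(b"\r\n")) is not None


def looks_like_mbox(path: str) -> bool:
    """Cheap check on the first line of a file reported as text/plain."""
    with open(path, "rb") as handle:
        return is_from_line(handle.readline(1024))


if __name__ == "__main__":
    print(is_from_line(b"From mildred@example.org Wed Aug 18 18:16:00 2010\n"))
    print(is_from_line(b"From: someone@example.org\n"))
```

Note that the check looks for "From " followed by a space, so an ordinary "From:" header at the top of a plain text file does not match.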
Re: [Tracker] Indexing mbox files
Here you can find a complete explanation of how to go from MIME to RDF:

http://live.gnome.org/Tracker/Documentation/Examples/SPARQL/Email

On Wed, 2010-08-18 at 18:16 +0200, Mildred Ki'Lya wrote:
> Hi,
>
> I am in the process of writing an extractor for mbox files. My purpose
> is to use Tracker as an internal search engine for another application,
> and this application stores conversations in mbox files with specific
> headers I need to parse. Additionally, it could be used to search
> within e-mails when you don't use Evolution.
>
> I have the following problems, and I thought you guys could help me.
>
> First (but this isn't important at all), libmagic reports mbox files as
> text/plain instead of application/mbox. So I have to register my
> extractor for both MIME types and filter afterwards.
>
> Then, I started looking at ontologies, and I had a hard time figuring
> everything out. I want to register both the mailbox and the messages
> within it. I started out reading the NMO (Nepomuk Message Ontology),
> only to find that there were no relations between an nmo:Mailbox and an
> nmo:Email. I had to look at the NIE (Nepomuk Information Element) to
> construct the relations between them.
>
> So, if I understood everything well, I should have something like:
>
> nmo:Mailbox - nie:hasPart - nmo:MailboxDataObject - nie:interpretedAs - nmo:Email
>
> and of course
>
> nmo:Email - nie:isStoredAs - nmo:MailboxDataObject - nie:isPartOf - nmo:Mailbox
>
> It seems rather complex to me, and I wanted to know if there is a
> simpler way of doing things.
>
> Thanks,
>
> Mildred

--
Philip Van Hoof
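The two relation chains quoted above can be written out as plain subject/predicate/object triples. The Python sketch below does only that: it takes three URIs that have already been chosen and returns the statements linking mailbox, stored data object, and e-mail. The function name and the example URIs are hypothetical, and the class and property names are the ones used in the message.

```python
def mailbox_triples(mbox_uri: str, data_object_uri: str, email_uri: str) -> list:
    """Statements linking one message to the mbox file that stores it."""
    return [
        (mbox_uri, "rdf:type", "nmo:Mailbox"),
        (mbox_uri, "nie:hasPart", data_object_uri),
        (data_object_uri, "rdf:type", "nmo:MailboxDataObject"),
        (data_object_uri, "nie:isPartOf", mbox_uri),
        (data_object_uri, "nie:interpretedAs", email_uri),
        (email_uri, "rdf:type", "nmo:Email"),
        (email_uri, "nie:isStoredAs", data_object_uri),
    ]


if __name__ == "__main__":
    triples = mailbox_triples(
        "file:///tmp/archive.mbox",
        "file:///tmp/archive.mbox#0",
        "urn:email:abc.123@example.org",
    )
    for subject, predicate, obj in triples:
        print(subject, predicate, obj)
```

Seen this way, the structure is two resources per message (the stored bytes and their interpretation as an e-mail) plus one resource per file, with each link stated in both directions.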