-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/112712/#review41434
-----------------------------------------------------------

>> Check the resource is non-empty before merging it

> How can the resource be empty? Every e-mail has at least a title and plain
> text content. Do you want me to check that the e-mail is actually valid and
> not an empty e-mail (a corrupted MIME file, for instance)?

Yup, that's right. Experience shows that every possible kind of corrupt file
will be out there somewhere.

- Simeon Bird


On Oct. 9, 2013, 9:06 a.m., Denis Steckelmacher wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/112712/
> -----------------------------------------------------------
>
> (Updated Oct. 9, 2013, 9:06 a.m.)
>
>
> Review request for Nepomuk.
>
>
> Repository: nepomuk-core
>
>
> Description
> -------
>
> This patch adds three new file extractors to Nepomuk. Two of them are of
> general use, and the third (which can be removed if it has no place in
> Nepomuk) is specific to the use case described at:
> http://steckdenis.be/post-2013-09-06-a-nepomuk-integration-plugin-for-firefox.html
>
> The MIME/mbox file extractor takes an mbox file or MIME files (as found in
> Maildir directory trees) and indexes them as NMO:Message objects. The full
> content of the e-mails is indexed along with their title, sender, receiver,
> CC/BCC, date and message ID. NCO:Contact and NCO:EmailAddress resources are
> created when needed. The main use of this indexer is to index e-mails managed
> by mutt, Thunderbird or any other e-mail client that does not use Akonadi.
>
> This indexer is a bit special because it also queries the Nepomuk server.
> mbox files are typically huge, and change every time the user adds a mail to
> them or removes one. This can cause many re-indexing operations, and as the
> file is big, every indexing operation can take quite a long time. To speed up
> the process, the file indexer tries to find already-indexed e-mails with the
> same message ID as the e-mails to be indexed. If a mail was already indexed,
> it is skipped. This reduces the amount of data transferred to the Nepomuk
> server (the full text of the mail doesn't have to be sent to the server only
> for it to detect a duplicate message), and an mbox file that took several
> minutes to index now only requires a couple of seconds.
>
> The vCard indexer parses vCard files using the KABC library and stores all
> the information found in them in NCO:Contact objects. vCard files containing
> more than one contact are supported. This allows users to export their
> contacts from a webmail or a contact-management application, and to have them
> indexed in Nepomuk.
>
> The last indexer reads .webaction files, which consist of one line describing
> the action ("DOWNLOAD"), then one parameter per line. This file indexer is
> used by the Nepomuk Integration plugin for Firefox, which uses this kind of
> file to establish a link between a downloaded file and its original location
> on the Internet. If you don't want such a specific file indexer to be part of
> the Nepomuk libraries, it can be removed from this patch.
>
> All these file indexers create resources but don't touch the indexed file
> itself. The reason is that an mbox file is not an e-mail, a vCard file is not
> a contact (it describes a contact), and also that these files can be
> temporary (for instance, the Firefox add-on creates a temporary MIME file
> whenever the user reads a mail on a webmail, and this file is deleted when
> the computer is shut down).
>
>
> Diffs
> -----
>
>   CMakeLists.txt 6e55d5e
>   services/fileindexer/indexer/CMakeLists.txt bcf8da2
>   services/fileindexer/indexer/mimeextractor.h PRE-CREATION
>   services/fileindexer/indexer/mimeextractor.cpp PRE-CREATION
>   services/fileindexer/indexer/nepomukmimeextractor.desktop PRE-CREATION
>   services/fileindexer/indexer/nepomukvcardextractor.desktop PRE-CREATION
>   services/fileindexer/indexer/nepomukwebactionextractor.desktop PRE-CREATION
>   services/fileindexer/indexer/vcardextractor.h PRE-CREATION
>   services/fileindexer/indexer/vcardextractor.cpp PRE-CREATION
>   services/fileindexer/indexer/webactionextractor.h PRE-CREATION
>   services/fileindexer/indexer/webactionextractor.cpp PRE-CREATION
>
> Diff: http://git.reviewboard.kde.org/r/112712/diff/
>
>
> Testing
> -------
>
> Nepomuk Core builds with this patch applied. MIME, mbox (as produced by
> Thunderbird), vCard (exported from Yahoo! Mail) and .webaction files are
> correctly indexed. If you want to test the webaction indexer, create a file
> somewhere (say "/tmp/test.txt"), and then put this in a .webaction file:
>
>   DOWNLOAD
>   http://www.example.com
>   http://www.example.com/test.txt
>   /tmp/test.txt
>
> Then, use "nepomukindexer" to index the .webaction file. Use "nepomukshow" on
> the /tmp/test.txt file, and check that everything is okay. You can also open
> Dolphin and see that the "downloaded from" information of the test.txt file
> is correctly displayed.
>
>
> Thanks,
>
> Denis Steckelmacher
>
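
For reference, the two checks discussed in this thread (skip empty or corrupt
messages, and skip messages whose message ID is already indexed) could look
roughly like the sketch below. It is written in Python with the standard
library only, not in the C++ of the patch. The names has_content and
messages_to_index are hypothetical, and the set of known message IDs is an
assumed stand-in for the query the extractor sends to the Nepomuk server.

```python
from email import message_from_bytes
from email.message import Message
from typing import Iterable, Iterator, Set


def has_content(msg: Message) -> bool:
    """Return True if the message has at least one header and some body text.

    An empty or corrupt file typically parses to a message without any
    header, so it is rejected here before anything is merged.
    """
    if not msg.keys():
        return False
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        payload = part.get_payload()
        if isinstance(payload, str) and payload.strip():
            return True
    return False


def messages_to_index(raw_messages: Iterable[bytes],
                      already_indexed_ids: Set[str]) -> Iterator[Message]:
    """Yield only the messages that are worth sending to the indexer.

    already_indexed_ids is an assumption of this sketch: it stands in for
    the lookup of existing message IDs on the server.
    """
    for raw in raw_messages:
        msg = message_from_bytes(raw)
        if not has_content(msg):
            continue  # empty or corrupt: nothing to merge
        message_id = msg.get("Message-ID")
        if message_id is not None and str(message_id).strip() in already_indexed_ids:
            continue  # already indexed: skip without resending the full text
        yield msg
```

Messages without a Message-ID header are still yielded in this sketch, since
there is nothing to compare them against; whether the real extractor should
index or drop them is a separate decision.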
_______________________________________________
Nepomuk mailing list
[email protected]
https://mail.kde.org/mailman/listinfo/nepomuk
