-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/112712/#review41282
-----------------------------------------------------------

services/fileindexer/indexer/mimeextractor.cpp
<http://git.reviewboard.kde.org/r/112712/#comment30263>

    Use the QString id you calculated above


services/fileindexer/indexer/mimeextractor.cpp
<http://git.reviewboard.kde.org/r/112712/#comment30264>

    Check the resource is non-empty before merging it


I tested this, and it doesn't work for me. I pointed it at my ~/.thunderbird
and got a bunch of messages like:

    nepomukstorage(21093)/nepomuk (storage service) Nepomuk2::Sync::ResourceIdentifier::runIdentification: DUPLICATE RESULTS!
    nepomukstorage(21093)/nepomuk (storage service) Nepomuk2::Sync::ResourceIdentifier::runIdentification: KUrl("_:uq") --> KUrl("nepomuk:/res/c8e2fb55-76f7-43ca-9ad7-1a81d997ceb3")

virtuoso was sitting on a whole core (probably to run the identifications?)
and the short identifiers ("KUrl("_:uq")") repeat ad infinitum.

Also, indexed emails never show up in Dolphin search.

I just applied it on top of git master - is there something else I should
have applied first?

- Simeon Bird


On Sept. 13, 2013, 12:38 p.m., Denis Steckelmacher wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/112712/
> -----------------------------------------------------------
>
> (Updated Sept. 13, 2013, 12:38 p.m.)
>
>
> Review request for Nepomuk.
>
>
> Repository: nepomuk-core
>
>
> Description
> -------
>
> This patch adds three new file extractors to Nepomuk. Two of them are of
> general use, and the third (which can be removed if it does not belong in
> Nepomuk) is specific to the use case described in
> http://steckdenis.be/post-2013-09-06-a-nepomuk-integration-plugin-for-firefox.html
> .
>
> The MIME/mbox file extractor takes an mbox file or MIME files (as found in
> Maildir directory trees) and indexes them as NMO:Message objects. The full
> content of the e-mails is indexed along with their title, sender, receiver,
> CC/BCC, date and message ID. NCO:Contacts and NCO:EmailAddress are created
> when needed. The main use of this indexer is to index e-mails managed by
> mutt, Thunderbird or any other e-mail client that does not use Akonadi.
>
> This indexer is a bit special because it also queries the Nepomuk server.
> mbox files are typically huge, and change every time the user adds or removes
> a mail from them. This can cause many re-indexing operations, and as the file
> is big, every indexing operation can take quite a long time. To speed up the
> process, the file indexer tries to find already-indexed e-mails with the same
> messageID as the e-mails to be indexed. If a mail was already indexed, it is
> skipped. This reduces the amount of data transferred to the Nepomuk server
> (the full text of the mail doesn't have to be sent to the server only for it
> to detect a duplicate message), and an mbox file that took several minutes to
> index now only requires a couple of seconds.
>
> The vCard indexer parses vCard files using the KABC library and stores all
> the information found in them in NCO:Contact objects. vCard files containing
> more than one contact are supported. This allows users to export their
> contacts from a webmail or a contact-management application, and to have them
> indexed in Nepomuk.
>
> The last indexer reads .webaction files, which consist of one line describing
> the action "DOWNLOAD", then one parameter per line. This file indexer is used
> by the Nepomuk Integration plugin for Firefox, which uses this kind of file to
> establish a link between a downloaded file and its original location on the
> Internet. If you don't want such a specific file indexer to be part of the
> Nepomuk Libraries, it can be removed from this patch.
>
> All these file indexers create resources but don't touch the indexed file
> itself. The reason is that an mbox file is not an e-mail, a vCard file is not
> a contact (it describes a contact), and also that these files can be
> temporary (for instance, the Firefox add-on creates a temporary MIME file
> whenever the user reads a mail on a webmail, and this file is deleted when
> the computer is shut down).
>
>
> Diffs
> -----
>
>   CMakeLists.txt 6e55d5e
>   services/fileindexer/indexer/CMakeLists.txt bcf8da2
>   services/fileindexer/indexer/mimeextractor.h PRE-CREATION
>   services/fileindexer/indexer/mimeextractor.cpp PRE-CREATION
>   services/fileindexer/indexer/nepomukmimeextractor.desktop PRE-CREATION
>   services/fileindexer/indexer/nepomukvcardextractor.desktop PRE-CREATION
>   services/fileindexer/indexer/nepomukwebactionextractor.desktop PRE-CREATION
>   services/fileindexer/indexer/vcardextractor.h PRE-CREATION
>   services/fileindexer/indexer/vcardextractor.cpp PRE-CREATION
>   services/fileindexer/indexer/webactionextractor.h PRE-CREATION
>   services/fileindexer/indexer/webactionextractor.cpp PRE-CREATION
>
> Diff: http://git.reviewboard.kde.org/r/112712/diff/
>
>
> Testing
> -------
>
> Nepomuk Core builds with this patch applied. MIME, mbox (as produced by
> Thunderbird), vCard (exported from Yahoo! Mail) and webaction files are
> correctly indexed. If you want to test the webaction indexer, create a file
> somewhere (say "/tmp/test.txt"), and then put this in a .webaction file:
>
> DOWNLOAD
> http://www.example.com
> http://www.example.com/test.txt
> /tmp/test.txt
>
> Then, use "nepomukindexer" to index the .webaction file. Use "nepomukshow" on
> the /tmp/test.txt file, and check that everything is okay. You can also open
> Dolphin and see that the "downloaded from" information of the test.txt file
> is correctly displayed.
>
>
> Thanks,
>
> Denis Steckelmacher
>
>
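
For reference, here is a minimal sketch of how a .webaction file as described in the quoted request (one action line such as "DOWNLOAD", then one parameter per line) could be read. It is not code from the patch: the WebAction struct and the parseWebAction function are hypothetical names used only for illustration, and skipping blank lines and trimming surrounding whitespace are assumptions. The request does not state what each parameter means, so the sketch keeps them as an ordered list rather than naming them.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for a parsed .webaction file (not part of the patch).
struct WebAction {
    std::string action;                   // first non-empty line, e.g. "DOWNLOAD"
    std::vector<std::string> parameters;  // following non-empty lines, in file order
};

// Remove leading and trailing whitespace, including a trailing '\r'.
static std::string trimmed(const std::string &s)
{
    const char *ws = " \t\r\n";
    const std::string::size_type begin = s.find_first_not_of(ws);
    if (begin == std::string::npos)
        return std::string();
    const std::string::size_type end = s.find_last_not_of(ws);
    return s.substr(begin, end - begin + 1);
}

// Read an action line followed by one parameter per line.
// Returns false if the stream contains no action line at all.
bool parseWebAction(std::istream &in, WebAction &out)
{
    out.action.clear();
    out.parameters.clear();

    std::string line;
    bool haveAction = false;
    while (std::getline(in, line)) {
        line = trimmed(line);
        if (line.empty())
            continue;  // assumption: blank lines carry no meaning
        if (!haveAction) {
            out.action = line;
            haveAction = true;
        } else {
            out.parameters.push_back(line);
        }
    }
    return haveAction;
}
```

Fed the four-line test file from the Testing section, this yields the action "DOWNLOAD" and three parameters, the last being the local path /tmp/test.txt. A real extractor would still need to check the action name and the parameter count before creating any resources.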
