-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/120438/
-----------------------------------------------------------

Review request for Baloo.


Repository: baloo


Description
-------

Current Baloo's file indexer takes batches of files by its command line and 
processes them async. There are several iterations over and creation of various 
collection classes. This patch set does away with all of this.

Instead, the worker process now is directed over stdin and responds over stdout 
(old school UNIX!). This is done using a very simple protocol which is fully 
documented in docs/ and is encapsulated in a convenience class so as to make 
future maintenance relatively painless in the advent that the protocol is 
adjusted or an entirely new system is put inplace. IOW, the next me should have 
an easier time improving on what is there. This allows the worker process to be 
long(er)-lived and moves more of the relevant bookkeeping to the same place 
(the worker process). The resulting code is about the same size but cleaner and 
much more performant.

The tracking of file progress is also moved to the (less expensive, easier to 
update incrementally) sqlite db where file data is already kept. It also 
versions the sqlite schema so that runtime upgrades can be made (as required 
here) both now and in the future without user intervention.

Finally, a tiny (93 LOC, including header) executable that wraps the protocol 
encapsulation class is provided for 100% backwards compatibility of *all* 
command line options that currently exist.

This makes it a 100% drop-in replacement.

Performance of the affected unit tests is ~2.5x better on my laptop: 7.6 v 3.0 
seconds.

Known weaknesses:

* still isn't hardened against multiple processes indexing simultaneously. the 
added sqlite columns are a step in this direction, howeer
* the DBus signal that was being emitted was removed and has not been replaced 
with anything yet. That is because the affected class is in baloo-widgets 
(which happens to be the only user of the relevant *public* class in the baloo 
file library) and I have not yet worked on that, pending this patch set being 
approved in principle. I am actually of the suspicion that keeping that data 
synchronized at all times is not needed .. but if it is, there are much better 
ways than DBus. Note that the DBus overhead was applied to *every single file 
indexed* even if Dolphin (the only user of this feature) wasn't running or had 
the metadata frame open.

The base idea behind this patch is to make file indexing as blazing fast as 
possible, which means using as few resources, CPU & memory, as possible. That 
ought to be the #1 priority for Baloo when it comes to indexing files, both to 
keep it away from user's notice on desktop class hardware as well as extend its 
relevance to smaller devices. The code in this patch is at least as clear (imho 
more so, actually) than what it replaces and was written for maintainability.


Diffs
-----

  src/file/extractor/app.cpp PRE-CREATION 
  src/file/extractor/baloo_file_extractor.h 
a8cff51ae763c36fb2934c80c04cbf0f710a01c7 
  src/file/extractor/baloo_file_extractor.cpp 
b2e7f97f2f0537807c314e2aca008d67711a69ce 
  src/file/extractor/extractorworker.cpp 
e710908dc8b02b7edc9f40e44b9d5d5883986846 
  src/file/extractor/main.cpp 606a1eb9d42b7079fee497a3fdffbe2327512c13 
  src/file/fileindexingjob.h PRE-CREATION 
  src/file/fileindexingjob.cpp PRE-CREATION 
  src/file/fileindexingqueue.h 5b1c612feb84046acf36b3dec3bf890da2cbf650 
  src/file/fileindexingqueue.cpp c6479cd25308bee9bb6fe6e6791391e6ca83bddf 
  src/file/indexingqueue.cpp c8ae4d9d1436ebe222a4758bd5ce551e091ec714 
  src/file/lib/CMakeLists.txt fb158a5f00f07ab48de963bf72924c83fc1c6673 
  src/file/lib/extractorclient.h 1e2ae09c7bd0068ca185b814c5c4996c0b563856 
  src/file/lib/extractorclient.cpp 221b90abe1899c27c03cf77e279fb71afce80b4d 
  src/file/lib/filemapping.cpp 5769fe23576c341917e8f37f7dd2743dd446a6e1 
  src/file/main.cpp 41855215c7b550c403d5ef523d29f9b73db58cb0 
  src/file/metadatamover.cpp 5117687c1247275235c3830b9d2234bfff353748 
  src/file/util.h a4cc2423fd46cda7146b02c572cc3f4ccd069e3e 
  src/file/util.cpp 2f7ac6de0dddc7d71c2b36bb4fe88677bfe52336 
  src/file/autotest/fileindexingjob/fileindexingjobtest.h PRE-CREATION 
  src/file/autotest/fileindexingjob/fileindexingjobtest.cpp PRE-CREATION 
  src/file/autotest/scheduler/schedulertest.cpp 
b91a481138382b609068b4737c654ac5182c37ec 
  src/file/basicindexingjob.h 3d2fb339e38be862332ddca45b4926f8000fa0c1 
  src/file/basicindexingjob.cpp 8d70fa2b1ca737eaa8d2e717a0eb2dbf430fac84 
  src/file/basicindexingqueue.cpp 8484c9b6868890cae6c2e19b29ea92d500aac03e 
  src/file/commitqueue.cpp b26f73cfdbc390cceef01aa6abd2b5af6f9689b2 
  src/file/database.h 6548de7dbce003112dad8c58c39c624dd806a261 
  src/file/database.cpp 5c9ccdc442d230bdc5dc9edbe5bc595537d08a8f 
  src/file/extractor/CMakeLists.txt 7cf3478f547f7fbcdf8908a3471b50fab09b5220 
  src/file/extractor/app.h 8059a16e8ec2e2a84d1a3b7eb1dd8cfc2a3136de 
  .gitignore 1377554ebea6f98a2c748183bc5a96852af12ac2 
  docs/file_extractor.md 3d830e2313bd35994f9fd1d6b52df422d0556fd8 
  src/file/CMakeLists.txt 1a9e6bf8f2c70b096fbb2c6c9eb78feec9d06622 
  src/file/autotest/CMakeLists.txt ccb8575fa8c07f2b136d259ed63d43a09ad75a12 
  src/file/autotest/basicindexingjobtest.h PRE-CREATION 
  src/file/autotest/basicindexingjobtest.cpp PRE-CREATION 
  src/file/autotest/fileindexingjob/CMakeLists.txt PRE-CREATION 
  src/file/autotest/fileindexingjob/extractor.cpp PRE-CREATION 

Diff: https://git.reviewboard.kde.org/r/120438/diff/


Testing
-------


Thanks,

Aaron J. Seigo

>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<

Reply via email to