When tracker starts up, it checks whether the entries in the DB are up to date.
The current logic is: for each file, tracker-miner-fs makes at least one dbus 
call to tracker-store, which executes a query. 
This is inefficient, since both dbus calls and queries are expensive. (You can 
see the calls with dbus-monitor.)

Here are two proposals to improve the performance.

1) Skip checks for ignored files:

In function crawler_check_directory_cb (tracker-miner-fs.c), there are two 
checks:
  
  should_check = should_check_file (fs, file, TRUE);
  should_change_index = should_change_index_for_file (fs, file);
  
As I understand it, if "should_check_file" returns FALSE, the result of 
"should_change_index_for_file" is meaningless, since we do not process such 
files (see function "should_process_file"). So we can use the same logic as 
"should_process_file" and only make the second check when the first passes: 

  if (should_check){
    should_change_index = should_change_index_for_file (fs, file);
  }
  else {
    should_change_index = FALSE;
  }
  
With this improvement, we can skip checks for files like ~/.cache/*, 
~/.config/*, etc.

2) Reduce dbus calls and queries:

(1) At the beginning, execute one query to get all the <url, fileLastModified> 
pairs and put them in a hash table.
(2) For each file, look up its uri in the hash table.
        If there is an entry, 
            compare the file's modification time with the fileLastModified 
            value from the hash table.
            If the values are equal,
                the entry is up to date.

    A query is only required when the uri is not in the hash table or the 
    times do not match. 

(3) There is another issue in the current implementation: 
the url for "Directory" files has a form like "urn:software-category", not 
"file:///" (see "miner_applications_process_file_cb" in 
tracker-miner-applications.c). So we should convert the uri format before 
looking it up in the hash table. 

(4) Free the hash table when the miner finishes. 
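
To make steps (1), (2) and (4) concrete, here is a rough sketch of the cache. 
It is not taken from the tracker sources: MtimeCache and every mtime_cache_* 
name are hypothetical, and in tracker itself a GHashTable would probably be the 
natural choice. Plain C is used here only so the sketch compiles on its own. 
The initial query that fills the table and the uri conversion from (3) are left 
out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry: one <uri, fileLastModified> pair. */
typedef struct MtimeEntry {
    char *uri;
    int64_t mtime;            /* modification time as stored in the DB */
    struct MtimeEntry *next;  /* next entry in the same bucket */
} MtimeEntry;

#define MTIME_CACHE_BUCKETS 1024

typedef struct {
    MtimeEntry *buckets[MTIME_CACHE_BUCKETS];
} MtimeCache;

/* Simple string hash (djb2 style), reduced to a bucket index. */
static unsigned long hash_uri(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % MTIME_CACHE_BUCKETS;
}

/* Step (1): create the table; the caller fills it from one big query. */
MtimeCache *mtime_cache_new(void)
{
    return calloc(1, sizeof(MtimeCache));
}

/* Adds one pair from the query result. Returns 0 on success, -1 on OOM. */
int mtime_cache_insert(MtimeCache *cache, const char *uri, int64_t mtime)
{
    assert(cache != NULL && uri != NULL);
    MtimeEntry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    size_t len = strlen(uri) + 1;
    e->uri = malloc(len);
    if (e->uri == NULL) {
        free(e);
        return -1;
    }
    memcpy(e->uri, uri, len);
    e->mtime = mtime;
    unsigned long b = hash_uri(uri);
    e->next = cache->buckets[b];
    cache->buckets[b] = e;
    return 0;
}

/* Step (2): returns 1 if the uri is cached with the same mtime (no query
 * needed), 0 if it is missing or the times differ (fall back to a query). */
int mtime_cache_is_up_to_date(const MtimeCache *cache, const char *uri,
                              int64_t disk_mtime)
{
    assert(cache != NULL && uri != NULL);
    const MtimeEntry *e;
    for (e = cache->buckets[hash_uri(uri)]; e != NULL; e = e->next) {
        if (strcmp(e->uri, uri) == 0)
            return e->mtime == disk_mtime;
    }
    return 0;
}

/* Step (4): release everything once the miner has finished. */
void mtime_cache_free(MtimeCache *cache)
{
    if (cache == NULL)
        return;
    for (size_t i = 0; i < MTIME_CACHE_BUCKETS; i++) {
        MtimeEntry *e = cache->buckets[i];
        while (e != NULL) {
            MtimeEntry *next = e->next;
            free(e->uri);
            free(e);
            e = next;
        }
    }
    free(cache);
}
```

The miner would call mtime_cache_is_up_to_date() for each crawled file and only 
go to tracker-store over dbus when it returns 0.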

In most cases, few or no files have changed between runs. With this 
improvement, tracker startup should become much faster.

Thanks!
-Zhenqiang