Re: [Tracker] Proposal to improve tracker-miner-fs up-to-date check performance

2010-03-30 Thread Michael Biebl
2010/3/30 Chen, Zhenqiang zhenqiang.c...@intel.com: Carlos Garnacho wrote: As Philip said, we should take into account memory usage as well, and keeping a hashtable for each known item is not going to be nice... TrackerCrawler guarantees that any directory will be processed after its parent

[Tracker] Proposal to improve tracker-miner-fs up-to-date check performance

2010-03-29 Thread Chen, Zhenqiang
When Tracker starts up, it checks whether the entries in the DB are up to date. The current logic is: for each file, there is at least one D-Bus call from tracker-miner-fs to tracker-store, which executes a query. This is inefficient because both the D-Bus call and the query are expensive. (You can get the
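
The per-file check described in this message can be sketched roughly as follows. This is only an illustration in Python, not tracker-miner-fs code: `FakeStore`, `query_mtime`, and `stale_files_per_file` are hypothetical names, and the in-memory store merely stands in for tracker-store behind D-Bus so that the number of round trips can be counted.

```python
class FakeStore:
    """Hypothetical stand-in for tracker-store; counts simulated round trips."""

    def __init__(self, rows):
        self._rows = dict(rows)  # url -> last-modified time recorded in the DB
        self.calls = 0           # simulated D-Bus calls, each running one query

    def query_mtime(self, url):
        # One call (and one query) per file: the cost the proposal targets.
        self.calls += 1
        return self._rows.get(url)


def stale_files_per_file(store, files_on_disk):
    """Return URLs whose on-disk mtime is unknown to, or differs from, the store."""
    stale = []
    for url, mtime in files_on_disk.items():
        if store.query_mtime(url) != mtime:
            stale.append(url)
    return stale


store = FakeStore({"file:///a": 100, "file:///b": 200})
disk = {"file:///a": 100, "file:///b": 250, "file:///c": 300}
print(stale_files_per_file(store, disk))  # URLs that would need re-indexing
print(store.calls)                        # grows linearly with the file count
```

The point of the sketch is only that the call count scales with the number of files, which is the inefficiency the message describes.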

Re: [Tracker] Proposal to improve tracker-miner-fs up-to-date check performance

2010-03-29 Thread Philip Van Hoof
On Mon, 2010-03-29 at 22:44 +0800, Chen, Zhenqiang wrote: 2) Reduce D-Bus calls and queries: (1) At the beginning, execute one query to get all the (url, fileLastModified) pairs and put them in a hash table. The problem here is that for people with a huge number of files, the URL keys will
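
The proposal quoted here (one query up front, results kept in a hash table) and the memory concern raised in reply can be sketched roughly as follows. This is a Python illustration under stated assumptions, not Tracker code: `load_all_mtimes`, `stale_files_bulk`, and `rough_key_bytes` are hypothetical names, and the `rows` list stands in for the result of the single query.

```python
import sys


def load_all_mtimes(rows):
    """Build a url -> fileLastModified table from the rows of one query."""
    return {url: mtime for url, mtime in rows}


def stale_files_bulk(table, files_on_disk):
    """Compare on-disk mtimes against the in-memory table; no further queries."""
    return [url for url, mtime in files_on_disk.items()
            if table.get(url) != mtime]


def rough_key_bytes(table):
    """Rough lower bound on memory held by the URL keys alone (assumption:
    sys.getsizeof of each key string; ignores the table's own overhead)."""
    return sum(sys.getsizeof(url) for url in table)


# Hypothetical result of the single up-front query.
rows = [("file:///a", 100), ("file:///b", 200)]
table = load_all_mtimes(rows)

disk = {"file:///a": 100, "file:///b": 250, "file:///c": 300}
print(stale_files_bulk(table, disk))  # URLs that would need re-indexing
print(rough_key_bytes(table))         # grows with the number and length of URLs
```

The trade-off is visible in the sketch: the lookups no longer cost a round trip each, but every URL stays resident in memory for the duration of the check, which is the objection raised in the reply.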

Re: [Tracker] Proposal to improve tracker-miner-fs up-to-date check performance

2010-03-29 Thread Carlos Garnacho
Hi! On Mon, 2010-03-29 at 22:44 +0800, Chen, Zhenqiang wrote: When Tracker starts up, it checks whether the entries in the DB are up to date. The current logic is: for each file, there is at least one D-Bus call from tracker-miner-fs to tracker-store, which executes a query. This is

Re: [Tracker] Proposal to improve tracker-miner-fs up-to-date check performance

2010-03-29 Thread Chen, Zhenqiang
Carlos Garnacho wrote: As Philip said, we should take into account memory usage as well, and keeping a hashtable for each known item is not going to be nice... TrackerCrawler guarantees that any directory will be processed after its parent folder, and all the items in a directory will be