On Fri, 2011-05-06 at 10:01 +0200, sean finney wrote:
> Hi Milan,
>
> On Fri, May 06, 2011 at 08:56:10AM +0200, Milan Crha wrote:
> > > As I already told seanus on IRC, I will be evaluating the
> > > performance of having vcards as files vs. having them in the db, and
> > > then choose whichever is best. So the code for both will be there
> > > and we can choose between them after testing. I was also thinking of
> > > providing it as an option for the backends to choose once I complete
> > > the testing.
> > > So what we discussed stays the same :)

http://git.gnome.org/browse/evolution-ews/tree/src/addressbook/e-book-backend-sqlitedb.h
has the APIs. The metadata APIs are a work in progress.
> >
> > This is not only about performance, my main concerns are these:
> > a) if something fails with db file, user's data are safe
> > b) users can take their contacts anytime and import them on another
> >    machine, in case of hard disk crash, partial backup or anything like
> >    that
>
> I think we should stop and consider two different motivations for this
> API. (1) Local addressbook (2) Local cache of remote addressbook. For
> case (1), I agree that having the items split out could be useful and
> a good safeguard against any db corruption (though my experience thus far
> with sqlite is fairly positive).
>
> For case (2), I would say if there's a problem with the file just nuke
> it and reload it from the remote store. Since you can guarantee that
> you can get a "working copy" of the info, you can then rely on the existing
> UI (or sqlite, or the remote service, or whatever), for exporting the
> contacts. It is a *cache* after all :)
>
> So for something like GAL (or any cached-from-remote addressbooks),
> I think it makes a lot of sense to *not* split out the contacts, at
> least as long as performance doesn't suffer by having more items in
> the sqlitedb file.

I wanted to compare the performance of the two methods on address books
that hold a large amount of data and choose whichever suits best. If it
turns out that there is a big difference between the two, I would
document that and let the backends choose how they want to store the
data.

>
> > c) folders.db files tend to grow "indefinitely". That's another point
> > why I do not like "one file per account".
>
> I'd like to clarify a detail of the API from having looked over it wrt
> evo-mapi: it's designed so that it can be used "one file per account", by
> creating a single db file and specifying the "folder" as an API parameter
> in all calls.
>
> But this means you could always create multiple db instances at different
> file locations, one per folder, and just use a junk "FOLDER" (or similar)
> name for the folder. Having looked over the current evo-mapi code, I
> think you'd want to do something like that.
>
> Of course if you think that there should *never* be a case where it's used
> one db per account, then rethinking the API would make sense, but otherwise
> nothing lost by keeping it, it gives you a way to do both.

I have made it configurable. Clients can either save all the address
books in one db or provide different paths so that they are stored in
separate db files.

>
> > An example: my evo-mapi account has 4 addressbooks (one is GAL). I would
> > really prefer to have them separated, not in one large file. Not talking
>
> And that should be possible, see above.
>
> > about possible (even unlikely) UID clashes between separate
> > addressbooks. Will it also mean that each local addressbook will be
> > stored in one large db? Please do not do that.
>
> The underlying db should deal with stuff like UID clashes, agreed. I
> think the current API does so, though I'm not convinced it's the best
> way. Currently, you have:
>
> const gchar *stmt = "CREATE TABLE IF NOT EXISTS folders \
>                      ( folder_id TEXT PRIMARY KEY, \
>                        folder_name TEXT, \
>                        sync_data TEXT, \
>                        bdata1 TEXT, bdata2 TEXT, \
>                        bdata3 TEXT)";
>
> stmt = sqlite3_mprintf ("CREATE TABLE IF NOT EXISTS %Q \
>                          ( uid TEXT PRIMARY KEY, \
>                            nickname TEXT, full_name TEXT, \
>                            given_name TEXT, family_name TEXT, \
>                            email_1 TEXT, email_2 TEXT, \
>                            email_3 TEXT, email_4 TEXT, \
>                            vcard TEXT)", folderid);
>
> which AIUI means a table named after every folder. Therefore the UIDs
> are already internally partitioned and will not conflict.
> WRT normalizing the database, I would suggest something more like:
>
> const gchar *stmt = "CREATE TABLE IF NOT EXISTS folders \
>                      ( folder_id TEXT PRIMARY KEY, \
>                        folder_name TEXT, \
>                        sync_data TEXT, \
>                        bdata1 TEXT, bdata2 TEXT, \
>                        bdata3 TEXT)";
>
> stmt = sqlite3_mprintf ("CREATE TABLE IF NOT EXISTS contacts \
>                          ( folder_id INT, \
>                            uid TEXT, \
>                            nickname TEXT, full_name TEXT, \
>                            given_name TEXT, family_name TEXT, \
>                            email_1 TEXT, email_2 TEXT, \
>                            email_3 TEXT, email_4 TEXT, \
>                            vcard TEXT, \
>                            PRIMARY KEY (folder_id, uid) )" );
>

On address-book deletion, dropping a table is far better than querying
and deleting all the contacts that match a folder id, although address
books are probably deleted only rarely. So I did a quick search and
found this:
http://stackoverflow.com/questions/784173/what-are-the-performance-characteristics-of-sqlite-with-very-large-database-files
which shows that using multiple tables is better. I have not personally
done any tests regarding this.

> As an extra bonus that means you could do autocomplete type
> queries in a single SQL query.

AFAIK with the current design of eds, each address book is queried
separately, so it would not benefit from this.

- Chenthill.

>
>
> sean
> _______________________________________________
> evolution-hackers mailing list
> evolution-hackers@gnome.org
> To change your list options or unsubscribe, visit ...
> http://mail.gnome.org/mailman/listinfo/evolution-hackers
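
For anyone who wants to try the two layouts discussed in this thread side by side, here is a rough sketch. It is not the e-book-backend-sqlitedb code: it uses Python's built-in sqlite3 module only to stay self-contained, trims the column list, stores folder_id as TEXT, and every function name and sample contact in it is made up for illustration. It shows that a UID reused across folders does not clash in either layout, how an autocomplete-style lookup differs (one query per table vs. a single query), and how a folder is removed (DROP TABLE vs. DELETE ... WHERE folder_id).

```python
import sqlite3

# Hypothetical sample data: folder id -> list of (uid, full_name, email_1).
# UID "1" is deliberately reused across folders.
FOLDERS = {
    "gal": [("1", "Alan Sample", "alan@example.com"),
            ("2", "Bob Sample", "bob@example.com")],
    "personal": [("1", "Alice Example", "alice@example.com")],
}


def quote_ident(name):
    """Quote a table name; identifiers cannot be bound as SQL parameters."""
    return '"' + name.replace('"', '""') + '"'


def make_per_folder_db(folders):
    """Layout A: one table per folder, named after the folder id."""
    db = sqlite3.connect(":memory:")
    for folder_id, contacts in folders.items():
        table = quote_ident(folder_id)
        db.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(uid TEXT PRIMARY KEY, full_name TEXT, email_1 TEXT)"
        )
        db.executemany(
            f"INSERT INTO {table} (uid, full_name, email_1) VALUES (?, ?, ?)",
            contacts,
        )
    return db


def make_single_table_db(folders):
    """Layout B: one contacts table with a composite (folder_id, uid) key."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE IF NOT EXISTS contacts "
        "(folder_id TEXT, uid TEXT, full_name TEXT, email_1 TEXT, "
        "PRIMARY KEY (folder_id, uid))"
    )
    for folder_id, contacts in folders.items():
        db.executemany(
            "INSERT INTO contacts (folder_id, uid, full_name, email_1) "
            "VALUES (?, ?, ?, ?)",
            [(folder_id, *contact) for contact in contacts],
        )
    return db


def autocomplete_per_folder(db, folder_ids, prefix):
    """Layout A: one query per folder table, results merged by the caller.

    The prefix is used as-is, so '%' and '_' in it act as LIKE wildcards.
    """
    hits = []
    for folder_id in folder_ids:
        rows = db.execute(
            f"SELECT uid, full_name FROM {quote_ident(folder_id)} "
            "WHERE full_name LIKE ? ORDER BY uid",
            (prefix + "%",),
        )
        hits.extend((folder_id, uid, name) for uid, name in rows)
    return hits


def autocomplete_single_table(db, prefix):
    """Layout B: a single query covers every folder."""
    return db.execute(
        "SELECT folder_id, uid, full_name FROM contacts "
        "WHERE full_name LIKE ? ORDER BY folder_id, uid",
        (prefix + "%",),
    ).fetchall()


def delete_folder_per_folder(db, folder_id):
    """Layout A: removing an address book is a DROP TABLE."""
    db.execute(f"DROP TABLE IF EXISTS {quote_ident(folder_id)}")


def delete_folder_single_table(db, folder_id):
    """Layout B: removing an address book deletes the matching rows."""
    db.execute("DELETE FROM contacts WHERE folder_id = ?", (folder_id,))


if __name__ == "__main__":
    per_folder = make_per_folder_db(FOLDERS)
    single = make_single_table_db(FOLDERS)
    print(autocomplete_per_folder(per_folder, sorted(FOLDERS), "al"))
    print(autocomplete_single_table(single, "al"))
```

Both lookups return the same (folder_id, uid, full_name) tuples when the per-folder version is given the folder ids in sorted order. Which layout is faster for large address books is exactly what still needs measuring; the sketch says nothing about that.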