On 27 Aug 2009, at 3:21am, danjenkins wrote: > We use a julian.decimal format to represent date/time (i.e. noon of > August > 26, 2009 would be 40049.5000) and we use this julian date for an > index key. > Our databases are frequently up to 3GB in size containing 10 million > records > with 15 assorted field types per record and contain 6 months of > data. I'm > proposing to break up the julian.date into two separate integers for > the > data and time and index off the date integer. > > Of course this change will give queries some sort of a speed boost, > but by > how much would you think?
We can't give you much idea because the answer depends too much on your data and what you're SELECTing on. I can only suggest you try it. Create the other column and an index on it without deleting your existing column or index. Write some code that does 100,000 SELECTs of the type you need using each index, and compare the total times. Whichever one loses gets killed. WHERE and ORDER BY times will depend on how 'clumpy' the dates in the table are. For instance, if almost everything happens on the 15th of the month, you'll have only 1/30th as many different values as you'd have if things happened on random days including weekends. Having 1/30th as many distinct values changes the amount of searching and sorting that needs to be done dramatically. Another aspect is which fields you need to retrieve when you do your SELECT. If your select needs to retrieve the time field, and the time field doesn't appear in the index it's using, it will need to read the time from the record too. So that would increase the amount of time taken for retrieval. And lastly, of course, sorting on INTEGERs is a little quicker than sorting on REALs. And there are a few other considerations about handing and passing integers as opposed to floating-point numbers. Simon. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users