Thanks for your responces! SS> This suggests that you are filling up a cache, especially if your BLOBs are large. In terms of overall time, 10000 queries in 2 seconds is 5000 queries a second. If you are using a rotating hard disk then this is not unexpected, given that your disk probably rotates at 5,400 rpm, that gives an average latency of 5.55ms /per read
> On the first run, thought, 10000 queries take >about 30 seconds! KM> Sounds normal as you are loading up the OS cache with data. In principle, of course, it's how caching works :) But, like I wrote, there is another engine and when using it (with non-SQLite database with the same contents), I am able to perform that 10000 queries a dozen times faster, even on a 1st run. This means SQLite uses a non-optimal solution in this case; well that's quite OK, 'cause that another engine was optimized for this particular DB structure, while SQLite was not. So the question is whether things in SQLite could be sped up by, for example, tweaking some SQLite settings / fine-tuning DB structure / splitting a big database into several smaller ones / etc.? If there are no brilliant ideas, then maybe it's pure bad luck by which all the required keys are spread over the entire file in SQLite DB, while in my other DB they are condensed in a portion of the file so that reading just a portion of the file already fills the cache with all the necessary data. I asked my question in assumption (possibly wrong) that this is not the case, and that there is another reason. Of course, if those 10000 queries were meant to read the entire DB, then of course any way to do that would require to ultimately scan the entire file - every possible DB engine would have to do that. But since we speak of just 10'000 keys (out of 130'000), then ideally only a portion of the file has to be read. So, if this is the reason of delays (and I am not sure it is, but isn't it probable?), - if SQLite reads more data than it would be ideally necessary, - is there a way to tweak something in such a way to reduce that excessing file accesses? KM> In order for caching to work on a USB attached device, it must not be mounted in "let the user pull it out at any time" mode (the default), or what Microsoft calls "optimize for random user yanking", but must be mounted as a connected device (called optimized for performance). Well OK if you say so, thought I believed that is only true for write-caching (which of course is dangerous if the device is removable), while read-caching does not depend on whether the drive is removable or not. Seemed logical to me, but might be wrong. SS> Is there a difference between a key being present in your database with NULL value, and the key not being present ? Surely there is. But, actually, those 10000 queries are made only to keys with a non-NULL value (usually less than 1 Kb, and maybe 20% are about 16 kb). 2015-01-08 16:24 GMT+05:00 Keith Medcalf <kmedc...@dessus.com>: > > >My database weights a little less than 2 Gbs and contains 130'000 keys. > >When I put it on HDD and try to make 10000 queries (extracting values for > >10000 different keys) with some additional processing of extracted > >values, it takes about 4 seconds on my PC on any run except the first, > >with maybe half of that time being the "additional processing" mentioned; > >even when I perform every query 3 times in a row (making it 30000 > queries), > >this time does not change. On the first run, thought, 10000 queries take > >about 30 seconds! > > Sounds normal as you are loading up the OS cache with data. Accessing > data cached in RAM is much faster than accessing data on spinning disk. > Have you set SQLite's page cache equal to a reasonable working set since > access to application cached pages is even faster than access to the OS > file cache -- and it avoids a jump into the kernel to do I/O which puts you > at the mercy of the dispatcher. Are you 100% I/O bound during the long > (initial) run, and 100% CPU bound during the quicker runs? > > >When I put the file on a USB flash drive, somehow I always get about 45 > >seconds total time, on either first run and subsequent runs. When the > >queries are tripled, the total time is tripled as well (even though the > >portions of the file to be read should already be cached when every > >enuqie query is repeated 2 more times). > > >This leads me to the conclusion that 1) The delays are produced by > >physical reading of the file, not by searching for a key or returning > >the value; > > >2) file on USB never gets cached (why?? due to some file-mode flags used > by > >SQLite engine? or it's just a MS Windows 8 issue?) > > This is an OS "feature". In order for caching to work on a USB attached > device, it must not be mounted in "let the user pull it out at any time" > mode (the default), or what Microsoft calls "optimize for random user > yanking", but must be mounted as a connected device (called optimized for > performance). You should also make sure you have it formatted as NTFS and > not as a FAT variant of some type. > > Further, USB itself is very slow, both in transfer time and especially in > turnaround time. Flash drives are also very slow, usually because they do > not need to operate at a speed faster than can be supported by the slow USB > interface. Is this a USB 1.0, 2.0, or 3.0 connection and device? > > > > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users