Thanks for your responces!

SS> This suggests that you are filling up a cache, especially if your BLOBs
are large.  In terms of overall time, 10000 queries in 2 seconds is 5000
queries a second.  If you are using a rotating hard disk then this is not
unexpected, given that your disk probably rotates at 5,400 rpm, that gives
an average latency of 5.55ms /per read

> On the first run, thought, 10000 queries take
>about 30 seconds!
KM> Sounds normal as you are loading up the OS cache with data.

In principle, of course, it's how caching works :) But, like I wrote, there
is another engine and when using it (with non-SQLite database with the same
contents), I am able to perform that 10000 queries a dozen times faster,
even on a 1st run. This means SQLite uses a non-optimal solution in this
case; well that's quite OK, 'cause that another engine was optimized for
this particular DB structure, while SQLite was not. So the question is
whether things in SQLite could be sped up by, for example, tweaking some
SQLite settings / fine-tuning DB structure / splitting a big database into
several smaller ones / etc.? If there are no brilliant ideas, then maybe
it's pure bad luck by which all the required keys are spread over the
entire file in SQLite DB, while in my other DB they are condensed in a
portion of the file so that reading just a portion of the file already
fills the cache with all the necessary data. I asked my question in
assumption (possibly wrong) that this is not the case, and that there is
another reason.

Of course, if those 10000 queries were meant to read the entire DB, then of
course any way to do that would require to ultimately scan the entire file
- every possible DB engine would have to do that. But since we speak of
just 10'000 keys (out of 130'000), then ideally only a portion of the file
has to be read. So, if this is the reason of delays (and I am not sure it
is, but isn't it probable?), - if SQLite reads more data than it would be
ideally necessary, - is there a way to tweak something in such a way to
reduce that excessing file accesses?

KM>   In order for caching to work on a USB attached device, it must not be
mounted in "let the user pull it out at any time" mode (the default), or
what Microsoft calls "optimize for random user yanking", but must be
mounted as a connected device (called optimized for performance).
Well OK if you say so, thought I believed that is only true for
write-caching (which of course is dangerous if the device is removable),
while read-caching does not depend on whether the drive is removable or
not. Seemed logical to me, but might be wrong.

SS> Is there a difference between a key being present in your database with
NULL value, and the key not being present ?
Surely there is. But, actually, those 10000 queries are made only to keys
with a non-NULL value (usually less than 1 Kb, and maybe 20% are about 16
kb).


2015-01-08 16:24 GMT+05:00 Keith Medcalf <kmedc...@dessus.com>:

>
> >My database weights a little less than 2 Gbs and contains 130'000 keys.
> >When I put it on HDD and try to make 10000 queries (extracting values for
> >10000 different keys) with some additional processing of extracted
> >values, it takes about 4 seconds on my PC on any run except the first,
> >with maybe half of that time being the "additional processing" mentioned;
> >even when I perform every query 3 times in a row (making it 30000
> queries),
> >this time does not change. On the first run, thought, 10000 queries take
> >about 30 seconds!
>
> Sounds normal as you are loading up the OS cache with data.  Accessing
> data cached in RAM is much faster than accessing data on spinning disk.
> Have you set SQLite's page cache equal to a reasonable working set since
> access to application cached pages is even faster than access to the OS
> file cache -- and it avoids a jump into the kernel to do I/O which puts you
> at the mercy of the dispatcher.  Are you 100% I/O bound during the long
> (initial) run, and 100% CPU bound during the quicker runs?
>
> >When I put the file on a USB flash drive, somehow I always get about 45
> >seconds total time, on either first run and subsequent runs. When the
> >queries are tripled, the total time is tripled as well (even though the
> >portions of the file to be read should already be cached when every
> >enuqie query is repeated 2 more times).
>
> >This leads me to the conclusion that 1) The delays are produced by
> >physical reading of the file, not by searching for a key or returning
> >the value;
>
> >2) file on USB never gets cached (why?? due to some file-mode flags used
> by
> >SQLite engine? or it's just a MS Windows 8 issue?)
>
> This is an OS "feature".  In order for caching to work on a USB attached
> device, it must not be mounted in "let the user pull it out at any time"
> mode (the default), or what Microsoft calls "optimize for random user
> yanking", but must be mounted as a connected device (called optimized for
> performance).  You should also make sure you have it formatted as NTFS and
> not as a FAT variant of some type.
>
> Further, USB itself is very slow, both in transfer time and especially in
> turnaround time.  Flash drives are also very slow, usually because they do
> not need to operate at a speed faster than can be supported by the slow USB
> interface.  Is this a USB 1.0, 2.0, or 3.0 connection and device?
>
>
>
>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to