Valentin Davydov wrote:
On Thu, Nov 21, 2013 at 11:43:32AM +0200, Baruch Burstein wrote:
Hi all,

I know SQLite is supposed to support DB sizes in the TB range (I think the
default configuration can reach 1TB). I am curious if anyone actually uses
SQLite at anywhere near this.

Yes.

Does anyone use it regularly for DBs 500GB+
in size, or with tables containing 10 billion rows+?

I've got an installation with an SQLite DB several terabytes in size.
It contains about 20 billion thoroughly indexed records, and it grows every
day (more precisely, every night) by a few tens of millions of new records.

How much concurrency does your use require?

I've spent some effort to eliminate concurrency in the application. That is,
updates and selects occur at very different times of the day.

How long do selects take (assuming indexes are set correctly)?

It depends on the size of the select. A single row is selected instantaneously.
Checking uniqueness takes about 1-2 minutes per million records, with most of
the time spent parsing SQL commands. Aggregate functions over a substantial
fraction of the entire database, of course, take too long to be executed in
real time.

Use SQLightning instead - concurrency issues are irrelevant then, since writers don't block readers. And it will search multiple gigabytes per second, as opposed to your millions-per-minute figure above.
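
As an aside, since most of the 1-2 minutes per million rows above is reportedly
spent parsing SQL: preparing the statement once and binding/resetting it per row
removes that overhead entirely, with stock SQLite or SQLightning alike. A minimal
sketch against the standard SQLite C API; the events(key) table and column names
are made up for illustration:

#include <sqlite3.h>

/* Check a batch of candidate keys for existence, compiling the SQL only once.
 * "events(key)" is a hypothetical indexed table/column. */
static int count_existing(sqlite3 *db, const sqlite3_int64 *keys, int n)
{
    sqlite3_stmt *stmt;
    int found = 0;

    if (sqlite3_prepare_v2(db, "SELECT 1 FROM events WHERE key = ?",
                           -1, &stmt, NULL) != SQLITE_OK)
        return -1;

    for (int i = 0; i < n; i++) {
        sqlite3_bind_int64(stmt, 1, keys[i]);
        if (sqlite3_step(stmt) == SQLITE_ROW)  /* key already present */
            found++;
        sqlite3_reset(stmt);                   /* reuse the compiled statement */
    }
    sqlite3_finalize(stmt);
    return found;
}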

Are there problems with locking? ("Normal" SQLite doesn't usually
suffer from locking since transactions are very quick, but if transactions
can be on the order of hundreds of ms, I think locking can easily happen if
the DB is accessed concurrently a few times a second, though I am not sure
whether this applies only to writes.)

Yes. A single transaction (insertion of those tens of millions of new records)
takes hours in the worst case.

There's no good reason for insertion of 10 million records to take hours.
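
A rough sketch of what a nightly bulk load can look like: a prepared INSERT
wrapped in explicit transactions of, say, 100,000 rows each, with durability
relaxed for the offline load. The table name, batch size, and pragma here are
illustrative only, not taken from the setup described above:

#include <sqlite3.h>

/* Insert n (key, value) pairs, committing in batches so the load is neither
 * one enormous transaction nor millions of tiny autocommits.
 * All names and settings are illustrative. */
static int bulk_insert(sqlite3 *db, const sqlite3_int64 *keys,
                       const sqlite3_int64 *vals, int n)
{
    sqlite3_stmt *stmt;
    const int batch = 100000;

    /* Reasonable for an offline nightly load; not for data you can't reload. */
    sqlite3_exec(db, "PRAGMA synchronous=OFF", NULL, NULL, NULL);

    if (sqlite3_prepare_v2(db, "INSERT INTO events(key, value) VALUES (?, ?)",
                           -1, &stmt, NULL) != SQLITE_OK)
        return -1;

    sqlite3_exec(db, "BEGIN", NULL, NULL, NULL);
    for (int i = 0; i < n; i++) {
        sqlite3_bind_int64(stmt, 1, keys[i]);
        sqlite3_bind_int64(stmt, 2, vals[i]);
        sqlite3_step(stmt);
        sqlite3_reset(stmt);
        if ((i + 1) % batch == 0) {            /* commit and start a new batch */
            sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);
            sqlite3_exec(db, "BEGIN", NULL, NULL, NULL);
        }
    }
    sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);
    sqlite3_finalize(stmt);
    return 0;
}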

I understand that the answer to most of these questions can be very
hardware (and software) dependent,

Indeed not so. The only hardware capable of storing such an amount of data
is an array of magnetic disks, and their latency (about 10-20 ms for
random access) is much greater than any reasonable software overhead. Even the
caches (SQLite's internal page cache and/or the operating system's file cache)
occupy the same memory and therefore have almost the same effect. The only
software that determines the performance is SQLite itself; in my case, perhaps,
the tree rebalancing algorithm.

1TB SSDs are only ~$500; there's no reason to limit yourself to the slowness of magnetic disks these days.

http://www.amazon.com/Samsung-Electronics-EVO-Series-2-5-Inch-MZ-7TE1T0BW/dp/B00E3W16OU

SQLightning uses only the OS filesystem cache, so you get maximal use of the available system RAM instead of wasting half of it with redundant copies in application-level caches.
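
With stock SQLite, a rough approximation of that behaviour is to shrink the
library's own page cache and memory-map the file, so reads are served from the
OS page cache instead of being buffered twice. The sizes below are purely
illustrative:

#include <sqlite3.h>

/* Reduce double-caching: small application-level page cache, mmap'd reads.
 * Values are illustrative; large mmap_size requests are clamped by
 * SQLITE_MAX_MMAP_SIZE unless it is raised at compile time. */
static void configure_cache(sqlite3 *db)
{
    sqlite3_exec(db, "PRAGMA cache_size = -65536",       /* ~64 MB app cache */
                 NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA mmap_size = 1099511627776", /* request 1 TB mmap */
                 NULL, NULL, NULL);
}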

but I am just trying to get a feel for
SQLite's applicability for a project I am working on that may reach limits
like these.

The only definitive SQLite limits are documented in the relevant manual page.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/