Re: [sqlite] Coding standard
On Thu, Dec 12, 2019 at 11:19:44AM -0500, Richard Hipp wrote:
>
> #define sqlite3Strlen30NN(C) (strlen(C)&0x3fff)
>
> The tool does not provide any details beyond "Use of strlen".

So why not just

#define sqlite3Strlen30NN(C) (strnlen(C,0x3fff))

? From the point of view of program logic it looks similar (at least to me), but it shifts the security burden from you to the authors of libc. And of course this should calm static analyzers anxious about strlen(), sprintf() etc.

Valentin Davydov.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to increase performance when inserting a lot of small data into table using indices
On Tue, Sep 10, 2019 at 05:25:38PM +0200, mailing lists wrote:
> Hi,
>
> I cannot really put all the inserts into one transaction because in case of a
> failure I lose all the already inserted data. Though I made some tests.
> There is hardly any performance gain anymore when doing 1000 or 10 000
> insertions in one transaction including immediate insertion into indices (in
> my case the difference is in the per cent range).

What do you mean by "losing data"? Do you need the data to be immediately available via SQL, or just written to persistent storage? In the latter case you can implement your own data cache, such as sequential log files, which are periodically (and/or on demand) rotated, then asynchronously parsed, inserted into the SQLite database with optimized CACHE_SIZE, transaction size, journal mode etc., and deleted only after a successful commit. Thus you shift the burden from SQL to the filesystem, which is less constrained by the natural structure of the data and might perform better.

Valentin Davydov.
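The log-and-replay scheme described above can be sketched as follows (the file name, table name, and schema are illustrative, not from the original post):

```python
import os
import sqlite3

LOG = "pending.log"   # illustrative name for the sequential log file
DB = "data.db"        # illustrative database name

def log_row(value):
    # Cheap sequential append; the row survives even if the later
    # SQLite insert fails, so nothing is lost on a crash.
    with open(LOG, "a") as f:
        f.write(value + "\n")

def replay_log():
    # Parse the log and insert everything in one big transaction;
    # the log is deleted only after a successful commit.
    con = sqlite3.connect(DB)
    con.execute("CREATE TABLE IF NOT EXISTS t(v)")
    with con:  # wraps the inserts in a single transaction
        with open(LOG) as f:
            con.executemany("INSERT INTO t VALUES (?)",
                            ((line.rstrip("\n"),) for line in f))
    con.close()
    os.remove(LOG)

for i in range(1000):
    log_row(f"row{i}")
replay_log()
```

The log file plays the role of the crash-safe buffer, so the transaction size becomes a pure tuning knob rather than a durability trade-off.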
[sqlite] 2D query
Hi, all!

Given the following table:

CREATE TABLE people(name,sex);
INSERT INTO people VALUES("Alex","F");
INSERT INTO people VALUES("Alex","M");
INSERT INTO people VALUES("Jane","F");
INSERT INTO people VALUES("John","M");

how do I construct a query which returns coalesced sex but individual names, such as "F: Alex, Jane. M: Alex, John."?

Sincerely,
Valentin Davydov.
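One way to get that output is group_concat() with a GROUP BY on sex. A sketch (the inner ORDER BY is the usual trick for getting names concatenated in sorted order; SQLite of this era does not formally guarantee group_concat() ordering, though sorting in a subquery works in practice):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE people(name, sex);
    INSERT INTO people VALUES ('Alex', 'F');
    INSERT INTO people VALUES ('Alex', 'M');
    INSERT INTO people VALUES ('Jane', 'F');
    INSERT INTO people VALUES ('John', 'M');
""")
rows = con.execute("""
    SELECT sex || ': ' || group_concat(name, ', ') || '.'
    FROM (SELECT name, sex FROM people ORDER BY sex, name)
    GROUP BY sex
    ORDER BY sex
""").fetchall()
result = " ".join(r[0] for r in rows)
print(result)
```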
[sqlite] wish list for 2016
On Fri, Dec 25, 2015 at 12:51:55PM +0000, Simon Slavin wrote:
>
> On 25 Dec 2015, at 12:39pm, Valentin Davydov wrote:
>
> > It would be desirable to improve the algorithm of the INTEGRITY_CHECK pragma.
> > Presently it is generally useless for indexed databases which don't fit
> > entirely in RAM (and which usually need checking much more strongly than
> > smaller ones).
>
> Valentin, could you expand on that ?

As far as I understand, INTEGRITY_CHECK simply iterates over the records (of tables and indices) one by one in some arbitrary order. So, if the database is too big to fit in the available memory (sqlite's own cache, system file cache etc.), then each iteration implies a random seek on disk(s), or even several in some scenarios. So, a check of a few-terabyte database with some tens of billions of records and a dozen indices would take more than 10^11 disk operations of more than 10 milliseconds each. That is, years.

My wish is to involve some intermediate level of cache (perhaps bitmaps?) which could reside in a reasonable amount of memory (say, < 1% of the total database size) and obviate the need to read each page more than once. Then things would speed up in proportion to the ratio of page size to record size.

> PRAGMA schema.foreign_key_check;
>
> ? Does it have the same problem as far as you're concerned ?

I don't have access to huge databases with foreign keys now.

Valentin Davydov.
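The "years" estimate above can be checked back-of-envelope (the numbers below are the rough figures from the post, not measurements):

```python
# Rough check of the integrity-check time estimate from the post.
records = 2e10                  # "some tens of billions of records"
indices = 12                    # "a dozen indices"
seek_s = 0.010                  # "> 10 milliseconds" per random disk seek
ops = records * (1 + indices)   # one seek per record per b-tree, worst case
years = ops * seek_s / (3600 * 24 * 365)
print(f"{ops:.1e} ops, ~{years:.0f} years")
```

With these inputs the operation count indeed exceeds 10^11 and the wall-clock time comes out in decades, consistent with "That is, years."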
[sqlite] wish list for 2016
Hi, All!

It would be desirable to improve the algorithm of the INTEGRITY_CHECK pragma. Presently it is generally useless for indexed databases which don't fit entirely in RAM (and which usually need checking much more strongly than smaller ones).

Valentin Davydov.
[sqlite] Native sqlite4 on FreeBSD
On Thu, Sep 10, 2015 at 01:30:37PM +0200, Eduardo Morras wrote:
>
> Use gmake to compile.

It didn't work either. Finally I just installed some brand new Linux on a nearby virtual machine, ran make -f Makefile.linux-gcc there and thoroughly repeated its output line-by-line on my FreeBSD, replacing gcc with cc. Eventually it compiled well (with almost the same set of warnings) and the result seems to work (at least within my own coverage). Thanks to the generally high coding style, there were only two idiosyncrasies to be corrected: the lack of a system-wide malloc.h (which is replaced by unistd.h in FreeBSD) and the lack of the fdatasync() syscall, which is already carefully screened from the sources by a special symbol called __ANDROID__ ;-)

By the way, clang kindly revealed a couple of forgotten "unsigned"s in the sources, which could (and did in the past) produce some nice bugs. Attention to developers.

> Note that there isn't a port,

I know. Despite the code being frozen for almost a full year, nobody wants to take the trouble of maintaining a FreeBSD port (perhaps including me).

Valentin Davydov.
[sqlite] Native sqlite4 on FreeBSD
Hi, all!

How does one build native sqlite4 (preferably a statically linked shell only, without any extensions) on a recent version of FreeBSD? It has clang instead of gcc as the default system compiler, so the makefiles bundled with the sqlite sources don't work.

Valentin Davydov.
[sqlite] AUTOINC vs. UUIDs
On Wed, May 20, 2015 at 11:52:08PM +0000, Peter Aronson wrote:
> Now you're just getting silly. What if the application sets all rowids,
> everywhere, to 1? The fact is, the chance of collision on a UUID is pretty
> astronomically low as long as a decent source of entropy is used
> (see http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates).
> Yes, some application might not generate proper UUIDs, but that's true
> with any scheme that needs to coordinate disconnected data editing or
> generation on multiple machines.

Moreover, there are widespread examples of colliding UUIDs, say EBD0A0A2-B9E5-4433-87C0-68B6B72699C7. This means that this idea has already proven to fail at its intended usage.

> There are lots of applications out there that use UUIDs pretty successfully.

Far fewer than the number of applications which use integers ;-)

Val. Dav.
Re: [sqlite] Using Sqlite3 as a Change Time Recording Data Store in Glusterfs
On Fri, Nov 21, 2014 at 02:01:39PM -0500, Joseph Fernandes wrote:
> We wanted to know the following
> 1) How could we improve the performance on the write side so that we have
> minimal latency?
> 2) Will there be any write performance hit when the number of records in the
> DB increases?

Generally speaking, you have to do some work to arrange your data (modification times) in some ordered way. This work can be done either in advance, as you suggest, or on demand, as some people have already told you. But either way it will eat up the necessary resources, regardless of whose code does it, yours or SQLite's. In practice (given magnetic disks as the underlying storage), the most scarce of the mentioned resources is rotational/seek latency, which detrimentally affects all disk operations of any scheduled priority. SQLite performs extensive random disk access (mostly reads) in most operation scenarios - selects, inserts, indexing etc. - with the possible exception of small updates of non-indexed data (which are accessed in a similar fashion by later selects). The only way to cope with a slow disk is to keep all the necessary data somewhere else, for example in a RAM cache. Of course, the cache itself should be populated in advance to give this benefit, and, given current RAM prices, it seems not very feasible to steal available memory from smart applications in favour of a dumb cache.

Hope these considerations will help you in tuning your code.

Valentin Davydov.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 50% faster than 3.7.17
On Fri, Sep 19, 2014 at 09:14:17PM -0400, Richard Hipp wrote:
> The latest SQLite 3.8.7 alpha version (available on the download page
> http://www.sqlite.org/download.html) is 50% faster than the 3.7.17 release
> from 16 months ago. That is to say, it does 50% more work using the same
> number of CPU cycles.

Are there any similar benchmarks with regard to disk I/O operations rather than CPU? Especially random reads on cache misses, I mean.

Valentin Davydov.
Re: [sqlite] Slow select from database
On Wed, Mar 12, 2014 at 02:38:15PM +0400, Георгий Жуйков wrote:
> 1. We have a database of measurements: time DATETIME, name TEXT, value
> NUMERIC
> indexes:
> 'tags_name_index' ON 'TAGS' ('NAME' ASC)
> 'tags_name_itemtime_index' ON 'TAGS' ('NAME' ASC, 'ITEMTIME' ASC)
> When recording, the auto_vacuum=INCREMENTAL flag is used
[skip]
> 4. The request of data is made for a time slot, i.e. from several databases.
> For example:
> SELECT COUNT(*) as COUNTER FROM Entries WHERE (TIMESTAMP BETWEEN @STARTTIME
> AND @ENDTIME)
> SELECT * from Entries WHERE (TIMESTAMP BETWEEN @STARTTIME AND @ENDTIME)
> ORDER BY TIMESTAMP DESC LIMIT 1000 OFFSET 0

I can't completely understand your database schema, but why didn't you use an index on just the DATETIME field when selecting data for a given time period?

> 5. Initially all requests are fast.
> After a while (about 50 minutes) the same requests start being executed more
> slowly, a request to each database taking from 1 to 30 minutes.
> A repeated request for the same data is quick.
> Updating System.Data.SQLite.x86 to version 1.0.91.3 doesn't bring any
> positive result.
>
> What to do?

First try to figure out where it spends so much time (I suspect random disk reads: there seems to be no other slow physical phenomenon in your system, except perhaps database locking by some other process which you didn't mention). Then either revise the logic of your application (e.g. by adding the above-mentioned index, or changing autovacuum to a forced vacuum scheduled at a specific time of day etc.) or give it more resources (say, by putting the database on an SSD drive).

Valentin Davydov.
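The suggested index on just the time column can be verified with EXPLAIN QUERY PLAN; a sketch (schema reduced to the columns from the post, index name illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Reduced illustrative schema; the original table has more columns.
con.execute("CREATE TABLE Entries(TIMESTAMP DATETIME, NAME TEXT, VALUE NUMERIC)")
con.execute("CREATE INDEX entries_time_index ON Entries(TIMESTAMP)")
plan = con.execute("""
    EXPLAIN QUERY PLAN
    SELECT * FROM Entries
    WHERE TIMESTAMP BETWEEN '2014-03-01' AND '2014-03-02'
    ORDER BY TIMESTAMP DESC LIMIT 1000
""").fetchall()
# The plan should show a SEARCH using entries_time_index, i.e. a range
# scan over the index instead of a full table scan plus sort.
print(plan)
```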
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Sat, Nov 23, 2013 at 06:18:29AM -0500, Richard Hipp wrote:
> On Sat, Nov 23, 2013 at 5:26 AM, Pepijn Van Eeckhoudt <
> pep...@vaneeckhoudt.net> wrote:
>
> > Is datetime special in this context or will constant expression hoisting
> > like this happen for any function?
>
> SQLite must know that the function always gives the same output given the
> same inputs. Not every function works that way. Counterexamples include
> random() and last_insert_rowid(). But most built-in functions are
> factorable in the same way that datetime() is.

Wait a second... and you'll get a different value of datetime('now'). In this sense datetime() is as deterministic as random(): it may give the same result on the next invocation or may not, depending on various circumstances not related to the function itself.

> Currently there is no API to designate an application-defined function as
> being "constant" in the sense that it always generates the same output
> given the same inputs. Hence, SQLite assumes the worst about
> application-defined functions and never tries to factor them out of the
> inner loop. Probably we should add a new API that allows the application
> to state auxiliary properties about application-defined functions (such as
> whether or not it is "constant", whether or not it can return NULL, whether
> or not it might change the encoding of its input parameters, etc.) But
> that has not been done yet.

In my opinion, the most general solution is to let the application programmer decide whether to calculate the function once (say, at the beginning of a transaction), store the result and then access the stored value, or to make a new call to the function on each iteration, depending on the application semantics.

Valentin Davydov.
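For what it's worth, SQLite later gained exactly the API discussed here: the SQLITE_DETERMINISTIC function flag, exposed in Python's sqlite3 module (Python 3.8+) as the deterministic keyword. A minimal sketch with a hypothetical function:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# deterministic=True maps to SQLITE_DETERMINISTIC: the function promises
# the same output for the same input, so SQLite is allowed to hoist
# constant calls out of the per-row loop instead of re-evaluating them.
con.create_function("double", 1, lambda x: 2 * x, deterministic=True)
con.execute("CREATE TABLE t(a)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
rows = con.execute("SELECT a FROM t WHERE a < double(5)").fetchall()
print(len(rows))  # 10
```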
Re: [sqlite] SQLite scalability
On Thu, Nov 21, 2013 at 11:43:32AM +0200, Baruch Burstein wrote:
> Hi all,
>
> I know SQLite is supposed to support DB sizes in the TB (I think the
> default configuration can reach 1TB). I am curious if anyone actually uses
> SQlite at anywhere near this.

Yes.

> Does anyone use it regularly for DBs 500GB+
> in size, or with tables containing 10 billion rows+?

I've got an installation with an SQLite DB several terabytes in size. It contains about 20 billion thoroughly indexed records, and grows every day (more precisely, every night) by a few tens of millions of new records.

> How much concurrency does your use require?

I've spent some effort to eliminate concurrency in the application. That is, updates and selects occur at very different times of the day.

> How long do selects take (assuming indexes are set correctly?)

It depends on the size of the select. A single row is selected instantaneously. A check of uniqueness takes about 1-2 minutes per 1 million records, most of the time being spent parsing SQL commands. Whereas aggregate functions over a substantial fraction of the entire database, of course, take too long to be executed in real time.

> Are there problems of locking ("normal" SQLite doesn't usually
> suffer from locking since transactions are very quick, but if transactions
> can be in the order of 100's of ms, I think locking can easily happen if
> the DB is accessed concurrently a few times a second, though I am not sure
> if this may only apply to writes).

Yes. A single transaction (insertion of those tens of millions of new records) takes hours in the worst case.

> I understand that the answer to most of these questions can be very
> hardware (and software) dependent,

Actually, not so. The only hardware capable of storing such an amount of data is an array of magnetic disks, and their latency (about 10-20 ms for random access) is much more than any reasonable software overhead. Even a cache (the internal SQLite page cache and/or the operating system file cache) occupies the same memory either way and therefore has almost the same effect. The only software which determines the performance is SQLite itself - in my case, perhaps, the tree-rebalancing algorithm.

> but I am just trying to get a feel for
> SQLite's applicability for a project I am working on that may reach limits
> like these.

The only definitive SQLite limits are documented in the relevant manual page.

Valentin Davydov.
Re: [sqlite] Write performance question for 3.7.15
On Fri, Dec 28, 2012 at 03:35:17PM -0600, Dan Frankowski wrote:
>
> 3. Would horizontal partitioning (i.e. creating multiple tables, each for a
> different key range) help?

This would seriously impair read performance (you'd have to access two indices instead of one).

Valentin Davydov.
Re: [sqlite] Fwd: Write performance question for 3.7.15
On Fri, Dec 28, 2012 at 03:34:02PM -0600, Dan Frankowski wrote:
> I am running a benchmark of inserting 100 million (100M) items into a
> table. I am seeing performance I don't understand. Graph:
> http://imgur.com/hH1Jr. Can anyone explain:
>
> 1. Why does write speed (writes/second) slow down dramatically around 28M
> items?

Most probably, the indices became too large to fit in the in-memory cache. You can verify this by tracing system activity: this threshold should manifest itself as a drastic increase in _read_ operations on the disk(s).

> 2. Are there parameters (perhaps related to table size) that would change
> this write performance?

CACHE_SIZE. It makes sense to enlarge it up to all the available memory.

Valentin Davydov.
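The CACHE_SIZE advice is applied per connection with a pragma; a sketch (the value is illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A negative cache_size is interpreted as a size in KiB rather than in
# pages; -65536 asks for a 64 MB page cache. For bulk inserts into a
# large indexed table, scale this up toward the available RAM.
con.execute("PRAGMA cache_size = -65536")
size = con.execute("PRAGMA cache_size").fetchone()[0]
print(size)  # -65536
```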
Re: [sqlite] Any limitation of file size on sqlite?
On Sun, Dec 16, 2012 at 10:02:41PM +0200, Çağlar Orhan wrote:
> Hello again,
> Thank you Simon.
> The page http://www.sqlite.org/limits.html has my answer partly. At what
> size in gigabytes does an .sqlite file stop working properly under MS IIS?
> I mean, is there a 10-12 GB sqlite file that works properly and without
> any problem?

There is only a limitation on the total number of pages in the file. So, the file size is limited by SQLITE_MAX_PAGE_COUNT times PAGE_SIZE. Thus, given a large enough page size, say, 8 to 64 kilobytes, sqlite successfully handles files of many terabytes in size, provided the underlying filesystem supports them, of course. Be prepared to wait quite a long time for non-sqlite operations (copy, delete etc.) on such big files to complete.

Valentin Davydov.
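The size limit described above (SQLITE_MAX_PAGE_COUNT times PAGE_SIZE) can be read out of a live connection; the concrete numbers depend on the SQLite version and build:

```python
import sqlite3

con = sqlite3.connect(":memory:")
page_size = con.execute("PRAGMA page_size").fetchone()[0]
max_pages = con.execute("PRAGMA max_page_count").fetchone()[0]
# Maximum database size in bytes = page size times maximum page count.
limit_bytes = page_size * max_pages
print(page_size, max_pages, limit_bytes / 2**40, "TiB")
```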
Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)
On Mon, Jul 02, 2012 at 05:05:25PM +0100, Niall O'Reilly wrote:
>
> On 2 Jul 2012, at 16:13, Nico Williams wrote:
>
> > That reminds me: it'd be nice to have a bit string type, since the
> > correct way to sort IPv4 CIDR blocks is as bit strings.
>
> Nice, definitely!
>
> > This is also
> > a proper way to sort IPv6 blocks. Alternatively, it'd be nice to have
> > native IP address types in SQLite4, as otherwise one has to jump
> > through hoops to handle IP addresses properly.
>
> Bit strings would be more general.
> Native IP would remove a sometimes-asserted motivation for preferring
> PostgreSQL.
>
> As I see it, ranges, as well as single addresses and CIDR prefixes,
> need to be supported, perhaps like the Perl Net::IP module does.

Individual IP addresses are nicely supported in the form of unsigned integers, and prefixes/ranges as contiguous ranges of such integers. For example, to determine whether a given IP address belongs to a particular subnet, one can evaluate "IP BETWEEN NETWORK_MIN AND NETWORK_MAX", which sqlite does quite efficiently. This is for IPv4 at least.

Valentin Davydov.
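The addresses-as-integers scheme can be sketched as follows (table name and addresses are illustrative):

```python
import ipaddress
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hosts(ip INTEGER)")  # IPv4 as unsigned 32-bit int
addrs = ["192.168.1.7", "192.168.1.200", "10.0.0.1"]
con.executemany("INSERT INTO hosts VALUES (?)",
                [(int(ipaddress.IPv4Address(a)),) for a in addrs])

# A CIDR prefix maps to a contiguous integer range [NETWORK_MIN, NETWORK_MAX],
# so subnet membership becomes an indexable BETWEEN.
net = ipaddress.ip_network("192.168.1.0/24")
lo, hi = int(net.network_address), int(net.broadcast_address)
rows = con.execute("SELECT ip FROM hosts WHERE ip BETWEEN ? AND ?",
                   (lo, hi)).fetchall()
print(len(rows))  # 2
```

With an index on the ip column, that BETWEEN is a single b-tree range scan.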
Re: [sqlite] Can't create empty database
On Mon, Jun 25, 2012 at 01:01:46AM -0700, L Anderson wrote:
> Googling on how to create a sqlite database (empty one) it appears
> I need only do 'sqlite3 test.db'. However, when I try that I get:
>
> ->sqlite3 test.db
> SQLite version 3.7.13 2012-06-11 02:05:22
> Enter ".help" for instructions
> Enter SQL statements terminated with a ";"
> sqlite>
>
> however, no database is created. What am I doing wrong and what
> do I need to do to create an empty database 'test.db'?

Perhaps you have to fill the database with some of your own data (and later delete it if you want a really empty DB). Otherwise it will not even remember your PRAGMA settings etc.:

$ rm -f test.db
$ sqlite3 test.db "pragma page_size=512;"
$ wc -c test.db
0 test.db
$ sqlite3 test.db "pragma page_size;"
1024
$ sqlite3 test.db "pragma page_size=512; create table t(c);drop table t;vacuum;"
$ wc -c test.db
512 test.db
$ sqlite3 test.db "pragma page_size;"
512
$

Valentin Davydov.
Re: [sqlite] index creation for single-use DB
On Sun, May 13, 2012 at 11:02:54AM +0300, Baruch Burstein wrote:
> I am using an in-memory DB to load data into it, do a few sort / find
> duplicates / SELECTs, and then dispose of the DB. It can vary in size from
> a few thousand rows to over a million.
> Would the time used for creating an index be worth it for just a single
> sort and a few SELECTs?

It depends on the selects. If more than 2-3 of them iterate over the entire table, then appropriate indexing would help. In fact, indexing a column comprises iterating over it and sorting the result (besides storing the index in memory/file).

> If so, would it be faster to do all the INSERTs and
> then add the index, or create the index and then do the INSERTs?

The former is faster, because all the necessary information is already available.

Val. Dav.
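The insert-first, index-afterwards pattern looks like this (names and sizes are illustrative); the final EXPLAIN QUERY PLAN confirms the late-built index is used by subsequent selects:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(v)")
# Load first: no index exists yet, so each INSERT touches only the table.
with con:
    con.executemany("INSERT INTO t VALUES (?)",
                    ((i % 997,) for i in range(100000)))
# Index afterwards: built in one pass over data that is already present,
# instead of being rebalanced on every single INSERT.
con.execute("CREATE INDEX t_v ON t(v)")
plan = con.execute("EXPLAIN QUERY PLAN SELECT v FROM t WHERE v = 42").fetchall()
print(plan[0][-1])
```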
Re: [sqlite] is SQLite the right tool to analyze a 44GB file
On Mon, May 07, 2012 at 01:34:13PM -0400, peter korinis wrote:
> So, if I use gawk to change my comma separated file to | (pipe) delimiter,
> will it work?

If you use gawk, you can just as easily change your comma separated file into a series of correct SQL INSERT statements (besides ensuring the validity of the input data, such as the right number of fields in each row) and not rely on the sqlite shell's csv-parsing capabilities.

Valentin Davydov.
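The same CSV-to-INSERT idea can be sketched in Python instead of gawk (the helper name and validation are illustrative, not from the post):

```python
import csv
import sqlite3

def csv_to_inserts(path, table, ncols):
    # Hypothetical helper: turn a CSV file into a transaction of INSERT
    # statements, checking that every row has the expected field count.
    stmts = ["BEGIN;"]
    with open(path, newline="") as f:
        for lineno, row in enumerate(csv.reader(f), 1):
            if len(row) != ncols:
                raise ValueError(f"line {lineno}: expected {ncols} fields, "
                                 f"got {len(row)}")
            vals = ", ".join("'" + v.replace("'", "''") + "'" for v in row)
            stmts.append(f"INSERT INTO {table} VALUES ({vals});")
    stmts.append("COMMIT;")
    return "\n".join(stmts)

# Small demonstration: commas and quotes inside fields survive the trip.
with open("sample.csv", "w", newline="") as f:
    csv.writer(f).writerows([["a", "b"], ["c,d", "e'f"]])
sql = csv_to_inserts("sample.csv", "t", 2)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(x, y)")
con.executescript(sql)
print(con.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2
```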
Re: [sqlite] is SQLite the right tool to analyze a 44GB file
On Fri, May 04, 2012 at 11:23:42AM -0400, peter korinis wrote:
>
> After I get the data loaded and inspect for nulls in prospective index
> attributes, can I add indices?

Yes, of course. Moreover, it would be much faster to add the indices all at once at the end rather than create them beforehand and then update them with every new piece of data.

> I was planning to load using the sqlite3 CLI ".import" command. Is there a way I
> can monitor the progress of the load, with only minimal impact on
> performance?

You can monitor (by some external means) either the database file size or the read pointer position in the input data file. Both of them grow linearly with the amount of data processed.

> I've started several loads only to find out hours later that
> nothing has been loaded.

Anyway, be prepared to spend some of your time on learning.

Valentin Davydov.
Re: [sqlite] is SQLite the right tool to analyze a 44GB file
On Tue, May 01, 2012 at 04:06:01PM -0400, peter korinis wrote:
> I'm new to SQLite - not a programmer, not a DBA, just an end-user with no
> dev support for a pilot project (single user, no updates, just queries).
>
> I want to analyze the data contained in a 44GB csv file with 44M rows x 600
> columns (fields all <15 char). Seems like a DBMS will allow me to query it
> in a variety of ways to analyze the data.

Yes, SQLite is quite capable of doing simple analysis of such amounts of data, especially selecting small subsets based on simple criteria. However, before trying to do some real work you have to understand the structure of your data, anticipate your possible queries and carefully design the database schema (tables and, equally important, indices). Perhaps putting all the data in a single 600-column table is not a good idea (though allowed technically), especially if your columns are equivalent in their physical nature: it is not so easy to select arbitrarily calculated columns, only rows.

> I have the data files and SQLite on my laptop: a 64-bit Win7 Intel dual-proc
> with 4GB RAM + 200GB free disk space.

A well-indexed database of small data pieces usually takes up several times more disk space than the raw data. Probably 200GB would not be enough, depending mostly on the number of indices. Consider dedicating a separate disk (or even a RAID array) to it.

> End-user tools like Excel & Access failed due to lack of memory. I
> downloaded SQLite ver.3.6.23.1. I tried to use the Firefox Data Manager add-on
> but it would not load the csv files - 'csv worker failed'. So I tried
> Database Master from Nucleon but it failed after loading (it took 100
> minutes) ~57,000 rows with error message = 'database or disk is full". I
> tried to create another table in the same db but could not with same error
> message. The DB size shows as 10,000KB (that looks suspiciously like a size
> setting?).

Try the bare sqlite shell instead of those external tools. It should take at least several hours to fill up your database.

> From what I've read SQLite can handle this size DB.

Surely it can. In one of my projects the database takes up almost 3 terabytes of disk space, contains more than 10^10 records and still provides small selects of indexed data in real time.

> 1. Is SQLite the wrong tool for this project? (I don't want the
> overkill and admin overhead of a large MySQL or SQL Server, etc.)

It depends on the data structure, semantics and what you are going to look for there. SQLite isn't very good at calculating complex aggregate functions, but works fine for simple selecting and sorting.

> 2. If SQLite will work, are there configuration settings in SQLite or
> Win7 that will permit the load - or is there a better tool for this project?

Increasing PAGE_SIZE to match the filesystem block (cluster) size, and perhaps CACHE_SIZE to fill most of the available RAM, would help a bit. Also, don't forget to turn off journaling and wrap everything in a single transaction when creating the database for the first time.

Valentin Davydov.
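The first-load settings suggested above (PAGE_SIZE, CACHE_SIZE, journaling off, one big transaction) can be sketched as follows; the file name, schema, and pragma values are illustrative:

```python
import os
import sqlite3

if os.path.exists("import.db"):
    os.remove("import.db")
con = sqlite3.connect("import.db")
con.execute("PRAGMA page_size = 4096")       # match filesystem cluster size
con.execute("PRAGMA cache_size = -1000000")  # ~1 GB of page cache
con.execute("PRAGMA journal_mode = OFF")     # no rollback journal during load
con.execute("CREATE TABLE data(a, b, c)")
with con:  # one big transaction for the whole load
    con.executemany("INSERT INTO data VALUES (?, ?, ?)",
                    ((i, 2 * i, f"x{i}") for i in range(100000)))
n = con.execute("SELECT COUNT(*) FROM data").fetchone()[0]
print(n)  # 100000
```

Journaling off means a crash mid-load corrupts the file, which is acceptable only because a failed first load can simply be restarted from the CSV.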
Re: [sqlite] question about zipvfs & performance
On Sun, Mar 25, 2012 at 07:48:51PM +0200, Tal Tabakman wrote:
> Hi,
> I am writing an application that performs a lot of DB writes. I am using a
> lot of recommended optimizations (like using transactions and more...)
> I want to improve my recording time by reducing the amount of I/O. One way
> to do so is by compressing the data before dumping it to disk.
> I am evaluating a sqlite extension called zipvfs. This VFS extension
> compresses pages before writing them to disk.
> I am using zlib compress/uncompress as my compression callback functions
> for this VFS. I assumed that database writing would be faster with this VFS,
> since compression means less I/O; in reality I see no difference (but the
> data is indeed compressed)...
> any idea why I don't see any recording time improvement?

Yes. If you are using a drive with rotating magnetic platters, then the most critical factor is seek (latency) time, rather than linear throughput. In other words, you are limited by the _number_ of random I/O operations, and not by the _amount_ of information written. So, if you want to improve your write performance, you have to use low-latency storage, such as an SSD drive for small databases or a RAID array with plenty of write cache memory for huge ones.

> is there an overhead with zipvfs?

You can easily measure this overhead yourself on an in-memory database. RAM is cheap now ;-)

Valentin Davydov.
Re: [sqlite] Slow inserts in indexed table
On Thu, Feb 09, 2012 at 02:05:53PM +0000, Simon Slavin wrote:
>
> On 9 Feb 2012, at 1:45pm, Valentin Davydov wrote:
>
> > CREATE TABLE T(X, Y, Z, ..., UNIQUE (X, Y, Z, ...) ON CONFLICT IGNORE);
> > CREATE INDEX IX ON T(X);
> > CREATE INDEX IY ON T(Y);
> > CREATE INDEX IZ ON T(Z);
> > ...
>
> Not an answer to your question, but it's worth knowing that those
> indexes may not be helping anything you're doing.

Perhaps IX is indeed redundant (because it should coincide with the beginning of the implicit uniqueness index), but the other indices are quite useful for doing quick selects of small subsets of data (e.g. SELECT MAX(Z) FROM T;), which is the typical usage pattern in my application.

> SQLite already has to create an index for coping with your UNIQUE
> clause. So it already has an index on
>
> (X, Y, Z, ...)

But this index is useless in searches which don't include the X column.

> Try removing those indexes and see if this improves matters.
> It'll certainly make your filesize smaller.

Not so significantly. They altogether use up no more than one third of the total space, the two other thirds being occupied by the table itself and by the implicit index respectively. Actually even less, because some columns don't have individual indices.

Valentin Davydov.
[sqlite] Slow inserts in indexed table
Hi, All!

I've got a database containing a single table with a dozen columns and some indices like this:

CREATE TABLE T(X, Y, Z, ..., UNIQUE (X, Y, Z, ...) ON CONFLICT IGNORE);
CREATE INDEX IX ON T(X);
CREATE INDEX IY ON T(Y);
CREATE INDEX IZ ON T(Z);
...

Data in the different columns are small positive integers, 32-bit integers or few-byte blobs. None of X, Y, Z, ... are unique by themselves, only their combination is unique (that's why I opted for a relational database to process them). My application treats this table as append-only, that is, doing either INSERT or SELECT on it, without any UPDATEs. A few million inserts are wrapped in a single transaction in order to reduce journal usage. The total number of records in the table is more than 10^10, so it doesn't fit in any RAM. PRAGMA CACHE_SIZE is adjusted to use most of the available memory. The journal file (of enough size to hold all the pages dirtied by the biggest transaction) is created in advance and PRAGMA JOURNAL_MODE is set to PERSIST in order not to bother the operating system with creating/deleting files. The page size is matched to the underlying filesystem block size and to the stripe size of the RAID containing that filesystem. The sqlite version is now 3.7.3, but it seems that the exact version doesn't matter.

When trying to insert data already present in the table, performance is fairly good: most of the CPU time is spent only on parsing SQL statements and converting data to the internal format, while the database operation itself (that is, checking data against the constraint) is almost instantaneous, which is quite impressive given the table size. But when the application inserts new data, things change drastically: total throughput drops by 2-3 orders of magnitude. The CPU stays almost idle, and all the time is spent waiting for disk _reading_ (rarely interspersed with fast and happy write bursts on each COMMIT). What is sqlite reading there? Does it try to perfectly balance each index on each insert (a million times per transaction), or something else?

Sincerely,
Valentin Davydov.
[sqlite] Shell bug: incorrect handling of some BLOBs.
Subsystem: Shell
Title: Incorrect handling of some BLOBs.
Type: Code_Defect
Severity: Important
Priority: Unknown
Affected: All SQLite versions containing output_hex_blob() up to 3.7.9.
Environment: Tested on various FreeBSD 8.x versions, both i386 and amd64. Probably, other operating systems would be affected as well.
How-to-reproduce: Obtain FreeBSD and install sqlite3 by one of the standard ways (from a package, port, etc.). Then see screenshot (shell output indented for clarity):

$ sqlite3 /tmp/db.tmp
SQLite version 3.7.9 2011-11-01 00:52:41
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t(v blob);
sqlite> insert into t values(X'0123456789');
sqlite> .mode insert
sqlite> select * from t;
INSERT INTO table VALUES(X'01234567ff89');
sqlite>

Description: In some circumstances (for example, when .mode insert is set and select results contain BLOBs) the shell calls the internal function output_hex_blob(). This function uses wrong data types, which can result in an implicit sign extension and thus data corruption.
Suggested-fix: The following patch helps. Perhaps there might be a better solution, such as reimplementing the necessary fprintf(3) functionality without relying on the operating system libraries.

--- src/shell.c.orig    2011-11-01 16:31:18.0 +0400
+++ src/shell.c 2011-11-10 22:45:11.0 +0400
@@ -490,7 +490,7 @@
 */
 static void output_hex_blob(FILE *out, const void *pBlob, int nBlob){
   int i;
-  char *zBlob = (char *)pBlob;
+  unsigned char *zBlob = (unsigned char *)pBlob;
   fprintf(out,"X'");
   for(i=0; i<nBlob; i++){ fprintf(out,"%02x",zBlob[i]); }
   fprintf(out,"'");
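The sign-extension effect behind this bug can be illustrated outside of C (a rough analogue of what happens when a signed char holding 0x89 is promoted to int and formatted with %x; the exact corrupted bytes in the shell output depend on the platform's printf):

```python
import ctypes

blob_byte = 0x89                           # high-bit byte from the report's BLOB
signed = ctypes.c_int8(blob_byte).value    # -119: what a signed char holds
promoted = signed & 0xFFFFFFFF             # the int that %x receives after promotion
print(f"{promoted:02x}")   # ffffff89 - spurious ff bytes corrupt the hex dump
print(f"{blob_byte:02x}")  # 89 - correct output with an unsigned char
```

Casting through unsigned char, as the patch does, keeps the promoted value in 0..255 and the %02x output exactly two digits.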