[sqlite] Fastest way to add many simple rows to a table?
Hi Eric, As I know you from Delphi related projects, I thought it would be nice to share my ideas. I am using a home-grown framework for sqlite persistence, optimized for storing records into tables. Main ideas: - using virtual tables (with the same structure) to speed up sqlite inserts a write translates to: insert into test("ID") select "ID" from temp.tmp_test - based on RTTI to generate optimized code for accessing/returning data - supports all common data (ints, floats, string, blob/TBytes) I tried your example and got ~0.45s for json and ~0.37s for my implementation; I have also tried using simple recursive cte (see code below) but it was slower, ~0.66s. I think most of the time is spent in packing the integers and building the btree(s) (i.e. CPU bound). Enabling the "name" field below ("storage" attribute) increased the processing time (my implementation only) to ~0.47, which I think confirms my interpretation. Interesting: when NOT using "primary key" (I have a flag for this), the times are ~ 0.26s vs 0.18s vs 0.47s ! Looks like sqlite is optimized to generate the rowid, and it is slower when explicitly set (even if it's the same value). I agree json works fast enough for such simple example, but I don't know how you could use it for e.g. 2 fields; Hope it helps, Gabriel Code: type TTestRecord = record [Storage('ID', [sfPrimaryKey])] //[Storage('ID')] ID : Int32; //[Storage('name')] name: string; end; procedure TForm2.Button2Click(Sender: TObject); var db : TSQLiteDatabase; tb : TVirtualTableDataArray; r : TTestRecord; i : integer; pf, t,t1,t2,t3: int64; json: TStringBuilder; begin tb := TVirtualTableDataArray.Create; json := TStringBuilder.Create; db := TSQLiteDatabase.Create; db.OpenWrite(':memory:'); tb.Setup(db.DB.DB, 'test'); tb.CreateTable; tb.VTInit; json.Append('['); for i := 1 to 100 do begin r.ID := i; r.name := i.ToString; tb.Items.Add(r); if i > 1 then json.Append(','); json.Append(i); end; json.Append(']'); QueryPerformanceCounter(t); db.db.ExecSQL('insert into test(id) select value from json_each(?)', [json.ToString]); QueryPerformanceCounter(t1); t1 := t1-t; tb.CreateTable; QueryPerformanceCounter(t); tb.WriteData; QueryPerformanceCounter(t2); t2 := t2-t; tb.CreateTable; QueryPerformanceCounter(t); db.db.ExecSQL('WITH RECURSIVE ' + 'cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt WHERE x<'+tb.Count.ToString+ ') insert into test(id) select x from cnt'); QueryPerformanceCounter(t3); t3 := t3-t; QueryPerformanceFrequency(pf); ShowMessageFmt('N=%d T1=%.2fs T2=%0.2fs T3=%.2fs', [tb.Count, t1/pf, t2/pf, t3/pf]); tb.Free; db.Free; json.Free; end; ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Select count(*)
I asked a similar question some time ago... See here: https://www.mail-archive.com/sqlite-users@sqlite.org/msg77488.html Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] YACC/Bison vs. Lemon vs. standard input
Hi, From the old time in school, Bison is a LL parser and I have see that Lemon is LR (LALR(1) ). That's it, Bison reduces as soon as it can while Lemon shifts as much as it can. You have to force Lemon to reduce "prog" rule, i.e. newline should be part of another rule: E.g. modified ladd.y: --> start ::= loop . loop::= prog NL loop . prog::= expr(a) prog . { printf("%d\n", a); } prog::= . expr(a) ::= INTEGER(b) .{ a = b; } expr(a) ::= expr(b) PLUS expr(c) . { a = b + c; } <-- Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite4 release date and how to compile on windows platform
Simon, I have asked several times similar questions, and got similar replies. This answer is simply NOT correct. While I understand nobody wants to commit to a release/stable version, I really wanted to evaluate how it fits my needs. Therefore I took some time to understand how could I compile it. I found the unpleasant truth: at this time (ok, a few weeks ago) Windows is not supported at all. The file lsm_unix.c is not ported to Windows (contains low level file access, shared memory, memory mapped files, etc). Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT the rowid)
Hi James, About complexity: I'm not sure it's NlogN; for each N you need to count N-1 columns, that's N^2 IMO. And you have an EXTRA temporary B-TREE? Doesn't it matter?? Although I don't really understand why, it has an index on it. My original concern is indeed simplicity and efficiency. But I'm NOT advocating any complex RANK function, but rather to expose (as pseudo-data) the row index IN THE CURRENT RESULT SET; while usually the result set is ordered, it doesn't really matter. This information HAS NO other technical meaning (like rowid). Like having about a list (on paper) containing (pseudo) RANDOM data and telling someone "look at 13th row"; by 13th I mean exactly as the list is printed. The other one can much easily spot 13 if available than to quickly count. If instead of row number I use "the row having xxx in it", the other one has to do some visual search (binary if the data would be sorted). Now all I say is, if I print the list from my code I can simply generate such info. But it would be greatly simplified if I can have it from SQL without complex/inefficient constructs, especially because it already does something similar for LIMIT. Of course I have to accept "no" as an answer if most people are against this idea. Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT the rowid)
Keith, I find this a little arrogant. I could say exactly the opposite, not caring about efficiency is ... (introduce whatever words you like). Why is this ranking addressed here? http://www.schemamania.org/sql/#rank.rows Why do you need "order by" in sql but may not ask the ranking having a given order?? Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT the rowid)
James, I reply from the web and I can't easily quote. I don't really want to argue whether it's a workaround or not. I understand perfectly that's valid standard sql. However please accept that the given sql is quite complex; you have to duplicate in the join clause the ordering... About the complexity, practical example (simplified from my case): CREATE TABLE temp.params (Chan INTEGER unique); insert into params values(2); insert into params values(4); insert into params values(3); insert into params values(1); explain query plan select count(lesser.rowid) as RANK, S.rowid, S.chan from params as S left outer join params as lesser on S.chan >= lesser.chan group by S.rowid order by S.chan This gives: SCAN TABLE params AS S USING INTEGER PRIMARY KEY SEARCH TABLE params AS lesser USING COVERING INDEX sqlite_autoindex_params_1 (Chan USE TEMP B-TREE FOR ORDER BY Compare to the simple usage: explain query plan select S.rowid, S.chan from params as S order by S.chan returns: SCAN TABLE params AS S USING COVERING INDEX sqlite_autoindex_params_1 I didn't even try, but AFAIK standard sql does NOT allow you to select additional columns if they are not in group by clause (which would break the logic), isn't it??? (though sqlite accepts it). Of course, there would be another sql solution for this too, using another join :) . Given all that, I will NEVER use the pure sql (if I can use any other solution). Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT the rowid)
James, Indeed, that's the case. However, I can't imagine this to be efficient. It's just a pure sql workaround to a counter. The only advantage is, it is standard sql and should work with any engine. I wonder if sqlite would make some optimizations out of it, otherwise it's O(n^2) WHEN having an index for the sort field. Update: with no index, sqlite uses only one temp b-tree which would be needed anyway for sorting. Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT the rowid)
Simon, Of course the rowid is THE field to use to uniquely identify a row (for edit, delete, etc...). I am talking about some ranking a result set. The ranking can be used for displaying (in HTML or a desktop GUI) or as source for insert/update sql. Of course in code you can have your own counter; what about writing a simple sql in the shell tool or any sql editor (I use SqliteSpy). There were lots of other questions about "re-assigning" the rowid to "natural" numbers (1..n); a solution would be to assign the rank to a field (another field, not the rowid). See the reply from James for the problem and a sql solution: http://www.schemamania.org/sql/#rank.rows How can this be efficient?? I don't use mysql, but Tony writes it can apparently do something like: SELECT @ROW:=@ROW+1 as ROW,* FROM table Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT the rowid)
Hi Simon, No problem, but I don't understand what has this to do with multi - processing / users?? (the query program has a lock on the table anyway). The counter can be either reused (see the other "LIMIT" discussion) or it needs a different implementation. As I don't think that the implementation is the problem, the real issues are: a) is it "lite" enough for the core? (I believe yes) b) is a specific (non-standard) feature desired? (I don't have a problem, but maintainers might have) While I agree non-standard features should be kept to minimum possible, I needed this so often that I simply think there is a larger potential benefit. Another example: I don't do web design, but I remember some WYSIWYG editors (Macromedia Dreamweaver) where it was easily possible to generate HTML tables (paged listings) from result sets. There was no code to write, just mark some fields from the result set. I'm sure other people can give more details about this... Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT therowid)
Hi, My opinion is a little different: a) of course the compiler needs to change the query program (only if "nrow" is requested/used) b) I don't know the internals, but I just can't believe that such a value could not be exported somehow c) I understand it would be non-standard; however there are lots of other specific features to sqlite d) in certain scenarios, the alternatives are quite difficult to achieve; e.g. you have a simple batch file to make some updates, no programming involved... I believe it's quite easy to do (technically), but it's just a matter of whether it's "lite" enough :) for the source code... Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT the rowid)
Simon, As I wrote it's easy to do when you control the loop. Sometimes there is either no explicit loop (direct sql, no programming code) or the loop is out of your reach (3rd party library). Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT therowid)
Then this register value is exactly the needed result. There is also the other syntax, "limit n, m"; you have to skip somehow "m" rows. Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there a way to return the row number? (NOT the rowid)
I also needed this for some special update queries. Without many details, it was some kind of "insert into xxx select , otherfields from source order by ". For this case there is a workaround, selecting first into a temporary table with auto generated rowid and using it afterwards for insert. There are lots of other cases where this would be handy, e.g. showing ordinal of some results. Of course it is simple to implement when the loop is under your control. But sometimes there are libraries (components) which are interfaced only by a cursor. E.g a listing/grid showing cursor results, where you can NOT control the implementation but want to show "row" as a column. I currently use another workaround for this, declaring a user function which simply increments/returns a counter. As long as the queries are simple it is ok, but wrong usage is easily possible (e.g. multiple times per row). As the sqlite query program already has an internal loop, it would be very easy to increment a counter for each "next" opcode and return it via a special function or pseudo column. I believe such a counter already exists for "LIMIT" clause, so most of the work is already done... Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select count(*) does not use primary key??
This is not at all my case ... I don't obviously write 1 by 1, but using blocks of data ( array of struct ), virtual tables wrappers, and "insert ... select". This way I can achieve >200k rec/s, or at least 100k when having some more fields. Right now I'm completely CPU bound, it's 100% load at high rate. IO is almost out of question, at <10MB /s; and I use 8k page size and of course synchronous off, wal mode... Another type of data (less fields but with a blob inside 2-32kB) easily reaches ~40MB/s but only a few thousands rec/s. The performance drops abruptly when having more fields (I don't remember the magic threshold); it seems most of the load is needed for field coding ? I use only integers for space optimization (varint); this is also good as I have high dynamic range. Multi-core sure helps to have enough CPU power for the rest (hardware connection, pre-processing, etc). I would definitely like to be able to get more performance, but I can live with the current numbers. One can use some high-end CPUs if really wants such high rates (the hardware around costs ~100x more :) ). BTW I asked a few times already, is it possible to get/compile a windows dll for sqlite4 (just for evaluation)? Last time I checked, it didn't compile on windows at all. Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select count(*) does not use primary key??
I admit I didn't think (or didn't even read in detail) about technical implementation. This is an extract from analyzer: *** Table AE_DATA Percentage of total database.. 99.89% Number of entries. 1030371 Bytes of storage consumed. 67846144 Bytes of payload.. 6118671990.2% Average payload per entry. 59.38 Average unused bytes per entry 0.34 Average fanout 752.00 Fragmentation. 0.35% Maximum payload per entry. 65 Entries that use overflow. 00.0% Index pages used.. 11 Primary pages used 8271 Overflow pages used... 0 Total pages used.. 8282 Unused bytes on index pages... 15678 17.4% Unused bytes on primary pages. 337429 0.50% Unused bytes on overflow pages 0 Unused bytes on all pages. 353107 0.52% So I understand that the 11 index pages are pure btree pages, but the leaves are actually in the ~8000 data pages. And it probably needs to visit (i.e. load) all data pages to count the leaves... Even if there would be some counter in the header of each page, it still needs to load the pages which is bad for IO... BTW I found this by opening some file over network, which of course made everything worse. For my case (file format) the data is append (write) only, so max(rowid) works equally good. As a note, I actually HAVE the record count stored somewhere else but I had this query in a generic copy routine which was also used for some other small tables. I agree it's some kind of corner case, usually tables have some kind of indices. But in this case I need high speed, indices would bring performance down. Not that I really need, but I have to support specified data rates up to 100k records / second. And I only access the data sequentially by rowid. Just for the sake of discussion: I imagine some hacks to the btree to optimize this special case. The btree nodes could store the number of leaves just for the data pages (e.g. 0: unknown, >0 valid number); it would need to propagate up the info just until it reaches a parent in an index page. And it needs to update this info only when a node changes from leaf to having a child. Thanks for all your time, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select count(*) does not use primary key??
Again sorry for count(a), I wrote too fast. I understand of course about null values. Otherwise by rowid I mean the autogenerated primary key. In my actual case, I have a field as alias. CREATE TABLE t(id integer primary key, a); explain query plan select count(*) from t -> scan table create index ia on t(id); explain query plan select count(*) from t -> SCAN TABLE t USING COVERING INDEX... 1. It means, the primary key is not as good as a cover index?? 2. Is there NO WAY to quickly get the row count WITHOUT full scan if I only have the auto primary key?? Thanks, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select count(*) does not use primary key??
OK I understand, then it remains the question why it does not use the primary key?? Thanks, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] select count(*) does not use primary key??
I was surprised to find that simple query "select count(*) from table" took too much time, and found that it does NOT use the primary key index?? e.g. CREATE TABLE t(a); explain query plan select count(*) from t I get : SCAN TABLE t (~100 rows) If I use CREATE TABLE t(a unique), then it uses the auto-generated cover index. Even if I write select count(rowid) from t it still uses scan table... However I would expect that it should also use the primary key for counting, or not?? In my opinion, count(*) is the same as count(rowid) (I see that even count() is accepted); I could say it's even the same as count(x) (any other field). Strange is, count(*) uses the cover index for a but "select count(a)" does NOT use the same cover index... Am I making any mistake here?? Thanks, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] experimental (better?) usage based sqlite cache
That's why I asked about feedback. My implementation is ~100 lines longer, so I think it's still "lite". There is nothing complex in it; apart from heap implementation, there are quite a few simplifications in the original code. Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] experimental (better?) usage based sqlite cache
I thought more about a "minus" (subtract minimum), but this might be a better option. Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] experimental (better?) usage based sqlite cache
I'm not sure we're talking about the same thing. For me caching here means avoiding IO, not memory and/or locks. The heap itself also needs some work, but logarithmic. The default value of 2000 pages cache should me enough for most useful pages (indices, roots...), having an overhead of max ~11 entries to update. Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] experimental (better?) usage based sqlite cache
Following a few other discussions, I had the feeling that sqlite should benefit from a cache which discards cached pages in a least frequently used order. It generally means, index pages, often used data pages, etc, should be preferred (meaning kept in memory) compared to some infrequent used pages. This helps me where I have big files which are mostly written once, but I also have some small tables with summaries; these should be better cached, the same for indices. I first implemented a custom cache in Delphi (Pascal) using some high level (generic) containers (hash for keys, heap for usage data); there is a significant overhead due to the classes I used and maybe also compiler differences. My own usage shows some visible improvements, therefore I took some time to implement it directly in core (pcache1). I would like to ask anyone who sees this interesting to try and give some feedback about benefits (if at all :)). Feedback / results / benchmarks are welcome. If it is useful, I would be happy to contribute it. The diff is done against 3.7.15.2 ; I'm not sure if it makes it to the list, so here is the diff text: http://pastebin.com/RrzqWjWv Technical details: - each page has a fetch counter - the LRU list is changed to a heap, arranged according to this counter - when discarding pages, the page with the minimum fetch counter is selected Apart from the heap operations, the other changes are quite straightforward. I run some tests to check for errors, maybe someone can check if the initialization is done in proper place (especially for shared cache group). There is an important catch; the fetch counter overflow. I don't have yet a definitive idea how/when to limit or to correct it. So this problem is currently postponed until the tests show actual benefit / interest. Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] query optimization with "order by" in a view
I admit I didn't check what the standards say about "select", I just wanted to make sure the potential users (which are by no means "developers") get the data properly. But you misread my example, I had "order by id" everywhere (no mixed sorting). I expected that the optimizer would "see" it's the same order and avoid doing it twice. Otherwise I can also admit that multiple/mixed "order by" would be problematic. Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite 4
The problem is, it is not ported to Windows as all... I made a quick (and dirty) port of lsm_unix, but other things (e.g. the environment) are also missing. Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] query optimization with "order by" in a view
It's hard to accept this conclusion... it seems like a simple justification. If you say so, why is "select from v order by id" not doing a sort (with the data from view)? Obviously it "sees" the id is the primary key and uses it for sorting. I read here lots of messages about complex query optimizations, and there is a whole chapter about this here: http://www.sqlite.org/optoverview.html#flattening I will obviously handle this somehow, but I hoped it was either a slip or would be relatively simple to implement. I have the feeling (at least for this case) that a simple rule (for the optimizer) is, only the last sort should be honored. That means, ignore the sort from the view; and because this case seems to be correctly handled Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite 4
I understand, but I wanted to make a performance comparison. I read some good news, but I need to test it for my case. Am in a situation where the bottleneck is the CPU (sqlite), not IO. Therefore I'm very interested in an early idea about performance. Even if it's not ready, I could at least prepare it better for a later switch. Gabriel -- Using Opera's revolutionary email client: http://www.opera.com/mail/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] query optimization with "order by" in a view
I need some clarifications on this issue. Here is a simplified example. There is a table: CREATE TABLE t(id integer primary key, data integer); and a (simplified) view: CREATE VIEW v as SELECT * FROM "t" order by id; I included the "order by" in view because it's meant for some end-users and I wanted to avoid mistakes. BUT I queried the view myself; I wanted to also be "safe" and included another "order by": explain query plan SELECT * FROM "v" order by id; As one can see, the plan uses 2 scans and an extra sort (btree)!! Which can be very expensive, of course... I could find the following: - if I query the view w/o "order by", it works as expected (uses primary key) - if I define the view w/o "order by", and use "order by" in query, it also works as expected - if I use the query with "order by rowid" (instead of id), the query plan is a little different; it has 2 scans, but it doesn't use a temporary btree anymore As a summary, it seems that having multiple "order by" disturbs the query builder; of course, I expected the "optimizer" to recognize that i was the same order and avoid extra sorting. Am I doing a mistake?? Thanks, Gabriel -- Using Opera's revolutionary email client: http://www.opera.com/mail/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite 4
I am also on final steps of a new project, and I would love to compare sqlite4 before release (the data files will be public, therefore a later switch would be problematic). Is there any chance to get it for windows? I usually need the dll. Last time I could not compile it (mingw), there are some memory mapping operations only for unix/linux (lsm_unix). They have equivalents for windows, so it should be possible to port it. Thanks, Gabriel -- Using Opera's revolutionary email client: http://www.opera.com/mail/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Deletion slow?
Wondering how nobody suggested, did you try "PRAGMA synchronous = OFF" ?? For me it is always the default... I can imagine how slow such a combination can be. Just my 2c... Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] truncate after using chunk size and related...
Hi, I have some problems trying to truncate a db file to its "normal" size. I use sqlite as a file format in a mostly append only mode. Because I need to be prepared for high data rate, I chose to use chunk size to avoid file fragmentation (e.g, I use a chunk size of 4MB, but sometimes up to 100MB for blobs). But very often data is very small, i.e. a few kb; therefore I need/want to truncate it back to the "normal" size. What is the correct way of doing it? Note: I am using WAL mode during writing, and switching back on stop (when I also want the truncation). Another note: I did find a way to achieve this (in app code), but I'm not sure if it's guaranteed. Therefore I'm not even writing how :), waiting for better answers. Using "PRAGMA auto_vacuum" and related does NOT solve anything, because the file is just bigger; there are no pages in "freelist". Vacuum is also not really an option; copying data is very expensive (can be very large), when there is nothing really to do. I might also have readers, so closing and manually truncating the file is also not an option. This is extract of summary from analyzer: Page size in bytes 8192 Pages in the whole file (measured) 8 Pages in the whole file (calculated).. 8 Pages on the freelist (per header) 0 0.0% Size of the file in bytes. 65536 At this point my file is 4MB, and there is no way to reduce it the required size of 8 pages(see above). Enabling debug and reading the source code, I found that pager_truncate is never called because the line if( pPager->dbSize!=pPager->dbFileSize ){ does not pass condition; both values at this point (I added some logging lines) show 512 pages (= 4MB file / 8kB page size). I understand that dbSize is initialized based on file size on open, and never really shrinks (unless one really writes many pages, then delete them)... Further reading, I found a possible solution: Why shouldn't PRAGMA incremental_vacuum truncate the file even if there is no free list??? I added this code to sqlite3BtreeIncrVacuum, before sqlite3BtreeLeave (therefore working regardless of autoVacuum mode!): //gc: truncate if needed! > Pager *pPager = pBt->pPager; > if(pPager->dbSize > pBt->nPage){ >if( pPager->eState>=PAGER_WRITER_CACHEMOD || > pager_open_journal(pPager) == SQLITE_OK ){ > sqlite3PagerTruncateImage(pPager, pBt->nPage); >} > } > sqlite3BtreeLeave(p); > return rc; > Is there any problem with this approach?? It looks to work as I want, except it writes pages to journal before truncating. This looks completely unnecessary as they contain no data; especially when the chunk is 100MB (yes, I also have this case for blobs). Maybe someone with more knowledge can solve it, this way or another. Thanks for attention, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] force read schema after "delete from sqlite_master"
Yuriy Kaminskiy wrote: >Check >PRAGMA secure_delete; >(and disable if it was enabled by default; it would be rendered ineffective by your trick anyway). It is not enabled. >Probably there are way to make it work (altering PRAGMA schema_version; or >something), but I think it is way to hackerish and unsafe to use such tricks in >anything resembling production code. >Basically, you break consistency of your database image (and then kind-of-"fix" >it with vacuum;). Yes, and that's why I asked here. "The Truncate Optimization" talks about something, but it's still quite slow. To understand what I'm saying, please just try it; fill a dummy db until it's a few GB in size then try to "clear" it *quickly*. - Michael wrote: >Why don't you just attach another database and switch your user connections to that one? >Then you can just delete the old file and not worry about vaccum at all. Unfortunately this is not an option at this time; readers are different independent processes, which are just prepared to handle a truncate... >Sounds though like sqlite3 could use a "truncate" command like Oracle has which is the speedy way to zero out a table there. This is what I asked for; but even with "Truncate Optimization" it still takes quite some time. It might be hard to avoid, because it probably needs to go through the btree to free pages. But I really believe that a "clear" for the db should be relatively easy to do it *fast*; even if it's done without journaling, because it makes no real sense to journal a full delete (it means a full backup). This is what I done via "delete from sqlite_master" / "vacuum", but I needed to be safe, therefore inside a transaction... Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] force read schema after "delete from sqlite_master"
Rather than dropping all tables, why not just use DELETE FROM TABLE myTable with no clauses ? Also, I would guess that the thing which is taking most of the time is the VACUUM command. Do you >really need it ? Are you very short of filespace ? Are you about to make special backup copies of >the newly empty tables ? Simon. As I wrote, delete / drop takes a lot of time (try it on a big db). I need a way to say "quick drop everything", there is nothing to worry about. Vacuum is instant (after drop tables), which is normal for an empty db. While space is not a major problem, in my model it is at least unusual. I use sqlite as file format in an append only model; file size (pages) is a direct measure of how much data is available. In this model, I need the "truncate" operation. I can live with my workaround... Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] force read schema after "delete from sqlite_master"
I poked through the sqlite source code and tried to force "sqlite3ResetOneSchema" or "sqlite3SchemaClear"... and found no clear way to do it... I found however an workaround: before clearing the file I get the table list and execute a rename (alter) for each table using fake names. Then delete from sqlite_master. After this recreating the tables works fast as expected. After closing the transaction the database structure should be valid of any reader; i also call afterwards vacuum/checkpoint to trim the space. The fake tables/names are also simply lost after the whole process. While this it works for me, I still think it is a weakness. Regards, Gabriel -- Using Opera's revolutionary email client: http://www.opera.com/mail/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] force read schema after "delete from sqlite_master"
Hello, I have the following scenario: I need to "clear"/"initialize" a db file while potential readers are active (polling for data). The "normal" way to do it is begin a transaction, drop all tables, recreate tables, commit (vacuum to regain space). The biggest problem is that dropping a "very large" table (many GB) takes a lot of time. I could very well delete the file, but that fails if any reader has it open. I tried with delete from sqlite_master where type in ("table","view","index") (of course after setting writable schema) This works very fast, but I have a different problem... I can't recreate tables because it looks like sqlite still knows about them ?? Even if "select * from sqlite_master" returns nothing?? Dropping takes the same long time as originally did... It WORKS if I do a "vacuum" first, but it won't work inside a transaction... I had to wrap everything in a transaction to block readers seeing an "empty" file. Therefore I also can't close/reopen etc... Test this from shell in a test db: create table test(a); insert into test values(1); select * from sqlite_master; pragma writable_schema=1; delete from sqlite_master where type="table"; pragma writable_schema=0; select * from sqlite_master; create table test(b); vacuum; create table test(b); How can I solve this?? Any help appreciated. Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLITE3 64-bit version
I read about it, but I prefer to use dll binding. With the amalgamation, it's also quite easy to compile to one obj and link directly in Delphi (similar to jpeg; that's probably what you also do). No pun intended, why should I pay for it? Off-topic: what do I need to do for a proper reply?? Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLITE3 64-bit version
There is one more reason to use DLLs, I'm surprised noone mentioned it. What if you don't use C??? (I use myself Delphi with a header conversion). Gabriel -- Using Opera's revolutionary email client: http://www.opera.com/mail/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Using \"attach database\" to work around DB locking
Or you can do your immediate writing to a database in memory, and have anotherprocess dump memory to disk in the background. Depending on how recent youneed reading you can read the one in memory or the one on disk. It seems I have reached the CPU boundary (>90% one 1 core), not waiting for the disk anymore... Plus that I'm not using fsync ("pragma synchronous"), so the disk cache is in effect anyway. If I need more through-output, I might go multi-threaded write (if possible). However it would be interesting to know what's really doing; I have an "append" only usage. I think most of the time is spent in updating/maintaining the primary key btree, which is a simple "INTEGER PRIMARY KEY" with null on inserts - so the values are auto-generated. I am using a virtual table with a block of values (all fields except the rowid) and a "insert into select * from ". Is there a possibility to optimize this simple case (because the number of records is known, so all new rowids are virtually known)? Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Using "attach database" to work around DB locking
Hi, I have a similar problem; I need to reach writing 10 records/s and parallel reading without blocking the writer. While previously it was not possible (I was considering HDF5 for this), now I decided to go back to sqlite. Using WAL mode and lots of optimizations I am able to write >15 records/s (one table with ~20 fields) and at the same time reading(processing) >30 records/s. There are lots of tricks/variables (number of fields, page size, cache size, wal checkpoint size...) to achieve this. Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] file preallocation with SQLITE_FCNTL_SIZE_HINT questions...
Hello all, I'm currently designing a new file format using sqlite for some specialized hardware; previously it wasn't possible, but the new WAL mode fulfills my needs. Except for some small summary info, the data is appended only. I have some other simple requirement to preallocate a file size: I allocate some space (e.g. 1GB or more) in a db file on start recording data, and want to truncate to real size on stop recording. The initial size does NOT need to be precise in any way... It is not for any optimization or something, but as a simple way to guarantee that nobody/nothing messes with disk space while a recording takes place. I already searched about this, and people suggested other means of solving the problem (partitioning, etc); while it might be possible, this solution is very simple, works very well for my case and I was already using it for the previous file format. Additionally, it is already used internally in sqlite. A few months ago (I don't remember which version) I tried using SQLITE_FCNTL_SIZE_HINT, and today I tried again (with 3.7.10). I found that the behavior changed, therefore I have to ask here. The file allocation worked and works. But in the older attempt the file was quickly truncated to the real needed size as soon as some data has been written. In the new attempt, the file size remains as allocated no matter how much I write to it, and even if I close it. I can't seem to find a way to force it down, except using "vacuum" (which is not an option). Can anyone explain why? From the source code it seems that the "older" behavior was intended, but I can't quickly see why it is different now. I actually want a combination: leave it allocated (but extend if needed) until I close it (or specify something else)... Is my requirement that unusual?? I might work around using SQLITE_FCNTL_CHUNK_SIZE (which I'm already using to avoid fragmentation) and pass the allocation value. On stop/close I reset it and it properly truncates. However I feel that the SQLITE_FCNTL_SIZE_HINT is the proper way, so please comment on this. Thanks and regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] non-blocking concurrent read questions
Hello, Thanks for your attention, > Although speaking generally such method could be used in some situations, I > don't think it's good to allow to use it even with a "i know what I'm > doing" > pragma. Any structured file (sqlite is an example) have internal > dependencies. One of the reasons to block is to write different parts of > structured data together without intervention from other parties in order > to > keep the data integral. Imagine writing cache that kept changes for your > writer and finally it needed to flush the data and at the same time your > "anytime" reader started to perform some query in the middle of this > multiply pages writing process. I can't predict whether the reader will end > with some valid result or it will return with "database malformed" error. > > Instead consider changing your own logic. You wrote "without* any > possibility to be blocked". I suppose you already have a perfect writer > that > fits your needs, but if you post some info about the nature of your writer > and reader (records per second and something like this), it would help to > be > more specific with answers. > > Max The writer application must be failsafe, as much as possible (acoustic emission recording devices); I simply can not afford that a reader makes a select and because of a programming error the acquisition be blocked. I had this just by opening sqliteman. The recording rate is variable; using a test structure (~14 fields in 1 table, all integers 32/64bit) I was able to achieve ~9 records /sec with sqlite API, which was not really good enough (there are is no jurnal; I don't need any rollback / consistency check). Then I was able to make a virtual table wrapper and insert/select and reached ~17 rec/sec, which is already a big step forward. I think is not planned, but I would like to have some bulk insert API (not sql) to speed up things; hdf5 with packet table API reached ~7-8 00 000 rec/sec and is essentially IO bound. I do not expect this kind of performance from sqlite soon... Maybe cache settings might also help, suggestions are welcome. For reading it is much better; here I also used an workaround, a fake aggregated function ("store(...)") is working much faster (and simpler to implement than a virtual table). So the reader can read faster anyway, it just needs to "follow" the data; small delays are normal and acceptable. Back to the technical problem; roughly described, I would expect that a table is expanded first with new data then (on transaction end, I used blocks of ~1000 records) update some metadata information about table pages (which pages are used for the data). I image that it could be made somehow safe (I'm sorry but I have no idea about the actual implementation): write data (old data remains valid), invalidate page list (if necessary, one single value which would be atomic on most systems if properly aligned), write the new page list, then mark it as valid again (nothing affects current behavior when using existing locking anyway). The reader could read without any problem "old" data; it would only need to detect a "dirty" read if the page list is changed to be incompatible (if new pages are appended only, the page list might always be valid). In this case the reader would work in a short loop (similar to the current behavior when locking) trying to get a "clean" page list. A trivial example: page count = 5, page list = 1,2,3,4,5 after writing 2 new pages page list append 6,7 (does not change the old list entries) then set page count = 7 (atomic) The reader reads either 5 or 7; in any case, the page lists are valid. If necessary, a (transaction) counter might be used to detect "dirty" reads (when really incompatible). So the question are: how is expanding working? would it fit such model? Sure the indexes might create problems... I only use the build-in rowid during writing. Thanks again, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] non-blocking concurrent read questions
Hello again, I start with your final words, "it's a general database engine". On the main page it writes: "Think of SQLite not as a replacement for Oracle but as a replacement for fopen()" That's why I try sqlite and not other database (I actually tried embedded innodb but sqlite was muuuch better / faster from the first try). Now the rest. I'm not saying that I need 80 rec/s, just that I reached this speed with hdf5. Our devices are currently rated at "only" >3 "hits"/sec; therefore my results at >15 rec/sec are good enough for the start. I MIGHT need high data rate (>20MB/s), but that includes blobs (waveform) data. When using blobs sqlite can also achieve high numbers in MB/s (of course less records). Obviously we do have now a proprietary format; but as with any format, changes are required to make room for new features. We tried to design a much better, self describing, extensible file format; at the end I realize that it's not that far from existing solutions like hdf5 or sqlite. That's why I'm here... Currently I have only 1 binary structure (union), but I would change that to 2 or 3 tables, one for each record type. I don't need any specific indexes during writing (other than native order); for later analysis optimizations, I might create some other indexes. Similar to what you wrote about B-trees for pages, I assumed that indexes are equally problematic on writing. I do understand that it is not feasible to change much about how it works right now, to make it valid for concurrent access. But I still have the feeling that it could work pretty well using this kind of model (let's call it "optimistic" locking): for the writer: 1. on write begin (write lock), increase a special counter (n -> n+1, odd value); this would mark pending changes 2. write new data/pages/references as usual 3. on write end (release write lock), increase the counter again (even value); this would mark the page structure as valid for the reader: 4. on begin read (shared lock), read (and store) the counter; if odd, a writer is active and should return "busy" immediately (just like now). 5. read the page list in cache (I assume it is doing this right now) 6. read the counter again and compare with the initial value; if no change is detected then page list is valid and it can read existing data. If change is detected, it should signal "busy" just like 4. 7. when releasing the shared lock, the counter can be checked again (against the value read on 4) and signal whether the data was changed or not. It is up to the user to decide what to do with the "dirty" data. For points 4, 6 it should either fail, or trying like now in a loop for a certain period. Even better, the change counter could (or actually "should") be per table. So for writing there would only be trivial changes (simple counter increment); for reading there would be some simple read/check of a value before using the page list. These relatively simple changes would open it for other applications; by searching I saw that I'm not the only one who needs this kind of behavior. Regards, Gabriel Corneanu ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] non-blocking concurrent read questions
Sure I could have some kind of intermediate storage, but that would mean unnecessary data moving / copying. I really hope that I'll find some time and try to study the source and eventually implement my ideas (maybe others find it interesting and/or useful too). You said that only references are changed, right? That means, during appends the page content is still valid even if B-trees structure is changed because of references. PS: what should I add to make the messages as reply (I'm using the web interface for email)? Regards, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] non-blocking concurrent read questions
Hello everybody, I have the following situation: 1. a writer needs to continuously append some data in 1 or 2 tables, * without* any possibility to be blocked. 2. one (or eventually more) reader needs to read the data for analysis. Pt 1 is very important; therefore I use a "PRAGMA locking_mode = EXCLUSIVE" in the writer. Then of course the reader can NOT read anything. As I understood until now, there is no standard way to have real concurrent read/write in sqlite. I understand that in normal database operations a read uncommitted is not possible with sqlite (I am talking about different processes). In my situation I (try to) use sqlite as a data storage format. During writing there are only appends, no inserts (in the middle) or updates. (after the writing, standard database operations are applicable). The reader needs to poll and try to read new data from the file. I hope that using some usage restrictions, this might be possible. Therefore I tried to enable reading without locking. So found the "SQLITE_ENABLE_LOCKING_STYLE" define which might allow (value 2) exactly this behavior; unfortunately it seems to be used only for Unix files. Is there a reason not to enable this behavior via a special pragma (only together with read-only)? I prefer to have a run-time switch, to be able to use the same code everywhere. Otherwise I modified winLock function with this code (the second condition is mine): .. if( pFile->locktype>=locktype ){ return SQLITE_OK; } // gc: always allow shared lock! if(locktype==SHARED_LOCK){ pFile->locktype = SHARED_LOCK; return SQLITE_OK; } I did not changed winUnlock, because it doesn't care if unlocking fails (from shared lock). Of course this is not a full implementation, just a quick change to test if it would work. Then I tried in a loop with 2 programs to write / read in parallel and it seems to work without problems. Can anyone advise if this has any chance to work (or say it would definitely NOT work)? As a short summary: would it be interesting for anyone to enable read-only open with a special pragma to allow reading without locking (that means, shared locks being a noop)? PS. I also tried to use HDF5; it is faster, but I like the simplicity of sqlite (and my data is not that complex to require hdf5). Thanks for your attention, Gabriel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users