Re: [sqlite] inserts, performance, file lock...
all settings: NameValue Modified auto_vacuum noneFalse automatic_index on False cache_size 2000False case_sensitive_like off False collation_list [NOCASE], [RTRIM], [BINARY] False count_changes off False default_cache_size 2000False empty_result_callbacks off False encodingUTF-8 False foreign_keyson False freelist_count 13345 False full_column_names off False fullfsync off False journal_modedelete False journal_size_limit -1 False legacy_file_format on False locking_modenormal False max_page_count 1073741823 False page_count 14071 False page_size 1024False read_uncommittedoff False recursive_triggers off False reverse_unordered_selects off False schema_version 371 False secure_delete off False short_column_names on False synchronous fullFalse temp_store default False temp_store_directory False user_version0 False wal_autocheckpoint 1000False Nico Williams wrote: What's your page size? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- View this message in context: http://old.nabble.com/inserts%2C-performance%2C-file-lock...-tp32814772p32825288.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] GROUP BY With ASC/DESC
Given the following table: DROP TABLE IF EXISTS key_table5; CREATE TABLE key_table5 ( name TEXT NOT NULL default '', color TEXT default NULL, price REAL default NULL, UNIQUE (name), PRIMARY KEY (color) ); I'm getting the following exception: SQLException: [SQLITE_ERROR] SQL error or missing database (near ASC: syntax error) with: SELECT name, color, Sum(price) AS price FROM key_table5 WHERE price '2' GROUP BY color ASC ORDER BY name ASC LIMIT 50 OFFSET 0 when either ASC or DESC is used with the GROUP BY clause. Is this a bug or a query statement inaccuracy on my part? SQLite version 3.7.2 3.7.9. danap. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] GROUP BY With ASC/DESC
On 11 Nov 2011, at 2:23am, dmp wrote: I'm getting the following exception: SQLException: [SQLITE_ERROR] SQL error or missing database (near ASC: syntax error) with: SELECT name, color, Sum(price) AS price FROM key_table5 WHERE price '2' GROUP BY color ASC ORDER BY name ASC LIMIT 50 OFFSET 0 when either ASC or DESC is used with the GROUP BY clause. Your query appears to be mixing simple results and aggregate results. The point of GROUP BY is to be able to do things like SUM(). But you have in your SELECT the 'name' column and there's no way to tell which name would be returned for any group. Try this statement SELECT color, Sum(price) AS sumPrice FROM key_table5 GROUP BY color If that works, gradually add components of the query and figure out which one causes the problem. I might, for example, expect uncertainty when you change it to SELECT name, color, Sum(price) AS sumPrice FROM key_table5 GROUP BY color Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] GROUP BY With ASC/DESC
dmp da...@ttc-cmc.net wrote: I'm getting the following exception: SQLException: [SQLITE_ERROR] SQL error or missing database (near ASC: syntax error) with: SELECT name, color, Sum(price) AS price FROM key_table5 WHERE price '2' GROUP BY color ASC ORDER BY name ASC LIMIT 50 OFFSET 0 ASC is not valid with GROUP BY, only with ORDER BY. What is it supposed to achieve, anyway? Your query makes no sense to me. -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] GROUP BY With ASC/DESC
On 11 Nov 2011, at 1:15pm, Igor Tandetnik wrote: ASC is not valid with GROUP BY, only with ORDER BY. What is it supposed to achieve, anyway? Your query makes no sense to me. Igor, I started out to write exactly the same thing, but then I saw the diagram on http://www.sqlite.org/lang_select.html which suggests that ASC is legal, even if it doesn't make sense. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] GROUP BY With ASC/DESC
SELECT name, color, Sum(price) AS price FROM key_table5 WHERE price '2' GROUP BY color ASC ORDER BY name ASC LIMIT 50 OFFSET 0 when either ASC or DESC is used with the GROUP BY clause. Is this a bug or a query statement inaccuracy on my part? Looks like a documentation bug. Indeed the select-core diagram incorrectly shows an ordering-term in the GROUP BY clause. It should probably be named grouping-term instead, with grouping-term being defined as a comma-separated list of column expressions. Others have already commented on the semantic of your query. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] GROUP BY With ASC/DESC
On Fri, Nov 11, 2011 at 02:24:54PM +0100, Jean-Christophe Deschamps scratched on the wall: Is this a bug or a query statement inaccuracy on my part? Looks like a documentation bug. Indeed the select-core diagram incorrectly shows an ordering-term in the GROUP BY clause. It should probably be named grouping-term instead, with grouping-term being defined as a comma-separated list of column expressions. Yes and no. The railroad diagrams are not definitive, and are meant more as an illustrative guideline, then a strict grammar. They do not directly reflect the actual parse tree used by the SQLite parser. Further, while the diagrams try to represent what the parser will accept, that may or may not (as in this case) be a logical SQL statement. Not all of the statement validation is done at the parsing level. As someone that's gone through and modified almost every diagram for republishing, I can assure you that if you start to dig into things, it is not difficult to find instances where you can use the diagrams to create statements that make no logical sense, and are not accepted by SQLite. The diagrams alone cannot keep you from avoiding errors if you do not know the language and how it works. -j -- Jay A. Kreibich J A Y @ K R E I B I.C H Intelligence is like underwear: it is important that you have it, but showing it to the wrong people has the tendency to make them feel uncomfortable. -- Angela Johnson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] GROUP BY With ASC/DESC
On Fri, Nov 11, 2011 at 8:20 AM, Simon Slavin slav...@bigfraud.org wrote: On 11 Nov 2011, at 1:15pm, Igor Tandetnik wrote: ASC is not valid with GROUP BY, only with ORDER BY. What is it supposed to achieve, anyway? Your query makes no sense to me. Igor, I started out to write exactly the same thing, but then I saw the diagram on http://www.sqlite.org/lang_select.html which suggests that ASC is legal, even if it doesn't make sense. Documentation bug is now fixed. http://www.sqlite.org/docsrc/info/40ef9e8de8 Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] JOIN vs IN
Suppose I have two tables, and I want to have look for a value in the first table, but display the columns from the second table. The most obvious way would be joining them on rowid. But I don't need to SELECT any columns from the first table, and it's a FTS4 table (which always joins a bit slower than real tables), so I currently do: SELECT * FROM table2 WHERE rowid IN (SELECT rowid FROM table1 WHERE amount 500) It returns the same results, but it doesn't seem much faster. Is there any performance difference to be expected from using IN instead of JOIN, or does SQLite internally rewrite JOIN queries to something similar as IN, which would explain they perform nearly the same? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] GROUP BY With ASC/DESC
Given the following table: DROP TABLE IF EXISTS key_table5; CREATE TABLE key_table5 ( name TEXT NOT NULL default '', color TEXT default NULL, price REAL default NULL, UNIQUE (name), PRIMARY KEY (color) ); I'm getting the following exception: SQLException: [SQLITE_ERROR] SQL error or missing database (near ASC: syntax error) with: SELECT name, color, Sum(price) AS price FROM key_table5 WHERE price '2' GROUP BY color ASC ORDER BY name ASC LIMIT 50 OFFSET 0 when either ASC or DESC is used with the GROUP BY clause. Is this a bug or a query statement inaccuracy on my part? SQLite version 3.7.2 3.7.9. danap. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] GROUP BY With ASC/DESC
I think you want ORDER BY COLOR,NAME Since color is a primary key you can't have dups so what good is the group by? Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of da...@dandymadeproductions.com [da...@dandymadeproductions.com] Sent: Friday, November 11, 2011 9:57 AM To: sqlite-users@sqlite.org Subject: EXT :[sqlite] GROUP BY With ASC/DESC Given the following table: DROP TABLE IF EXISTS key_table5; CREATE TABLE key_table5 ( name TEXT NOT NULL default '', color TEXT default NULL, price REAL default NULL, UNIQUE (name), PRIMARY KEY (color) ); I'm getting the following exception: SQLException: [SQLITE_ERROR] SQL error or missing database (near ASC: syntax error) with: SELECT name, color, Sum(price) AS price FROM key_table5 WHERE price '2' GROUP BY color ASC ORDER BY name ASC LIMIT 50 OFFSET 0 when either ASC or DESC is used with the GROUP BY clause. Is this a bug or a query statement inaccuracy on my part? SQLite version 3.7.2 3.7.9. danap. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [bug] LIKE operator ignores rest of string after NUL character
Simon Slavin wrote: On 9 Nov 2011, at 8:03pm, Yuriy Kaminskiy wrote: Look at: SELECT hex(X'1245005679'),hex(X'1245001234'); And compare: SELECT X'1245005679' LIKE X'1245001234'; 1 -- incorrect SELECT X'1245005679' = X'1245001234'; 0 -- correct SELECT X'1245005679' X'1245001234'; 1 -- correct LIKE (in both native and icu implementations) ignores value length (sqlite_value_bytes) and stops at NUL character. Compare that with = and that compares full value. Arguable SQL_BLOB should be illegal operand for LIKE/GLOB, but then this should be documented and they should return error when called with BLOB argument instead of returning nonsense. The definition of LIKE on the expression page states (not with complete clarity) that LIKE operates on strings. So we're back to an old favourite: a decision as to what constitutes a string inside SQLite. One definition would say that a string would terminate with the first 0x00 no matter how many bytes are stored (C style strings). Another would say that the string terminates with the last byte stored (Pascal style strings). One way or other, =, LIKE and GLOB results should be consistent. If string is NUL-terminated, = should ignore everything after NUL. If string is length-terminated, LIKE should not ignore bytes after NUL. If devs have no time to change code and/or make decisions now, those cases should be at least documented as function/operation A,B,C (currently) have undefined behavior on BLOB. Not very nice, but randomly stumbling over such surprises is certainly worse. ... and few related strangeness: UPPER/LOWER/*TRIM/REPLACE also accept BLOB argument, but return *TEXT* instead. With somewhat unexpected result: sqlite SELECT length( X'41424300414243313233'); 10 sqlite SELECT length( lower(X'41424300414243313233')); 3 sqlite SELECT length(CAST(lower(X'41424300414243313233') AS BLOB)); 10 sqlite .mode insert sqlite select rtrim(X'4142430061626333',X'334363'); INSERT INTO table VALUES('ABC'); sqlite select CAST(rtrim(X'4142430061626333',X'334363') AS BLOB) INSERT INTO table VALUES(X'414243006162'); (BTW, one can consider that *TRIM, LIKE and GLOB semantic on BLOB should be different - with BLOB they should operate on *bytes*, not *utf-8 chars*; one more reason to state undefined behavior for now). Same with || operator (OP_Concat): it takes BLOB, but make result TEXT (leading to [arguable illegal] TEXT with embedded NUL {or with broken UTF-X encoding}: X'1234' || X'004567' [also look at http://permalink.gmane.org/gmane.comp.db.sqlite.general/68473, where this problem triggered invalid .dump]) SUBSTR also documented to operate *on string*, but, unlike above functions, when supplied with BLOB argument it correctly returns BLOB value. Not sure if it worth changing code, but certainly should be somehow mentioned in documentation (including clearly stated undocumented behavior on BLOB as option). I don't think any official definition is stated anywhere in the documentation intended for users. (I haven't looked at comments in the source code.) And I suspect that if DRH wants to decide one way or another, this may point up some inconsistencies in the codebase which should be quickly fixed. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Segment merging in FTS and updates-deletes
Hi, I'm studying the indexing mechanism of FTS3/4, I can pretty much understand how doclists, terms, segments are created and stored, but one thing I can't grasp is about updating and deleting docs and keeping up the index up to date. From the source comments: [quote] ** Since we're using a segmented structure, with no docid-oriented ** index into the term index, we clearly cannot simply update the term ** index when a document is deleted or updated. For deletions, we ** write an empty doclist (varint(docid) varint(POS_END)), for updates ** we simply write the new doclist. Segment merges overwrite older ** data for a particular docid with newer data, so deletes or updates ** will eventually overtake the earlier data and knock it out. The ** query logic likewise merges doclists so that newer data knocks out ** older data. [/quote] Its clear to me that with the way things are stored, it would be crazy to update all doclists with matches related to a single docid. I just don't see how a segment merge can possibly know which doclist is older/newer, other than by the level. What happens when a document stored in a level 0 segment (recently inserted) is updated or deleted? Which one will be kept and go up to a level 1 segment? -- View this message in context: http://old.nabble.com/Segment-merging-in-FTS-and-updates-deletes-tp32827350p32827350.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [bug] LIKE operator ignores rest of string after NUL character
On Fri, Nov 11, 2011 at 11:55 AM, Yuriy Kaminskiy yum...@mail.ru wrote: One way or other, =, LIKE and GLOB results should be consistent. If string is NUL-terminated, = should ignore everything after NUL. If string is length-terminated, LIKE should not ignore bytes after NUL. blob = blob should be a binary comparison blob = string should be a string comparison blob LIKE pattern should either treat the blob as a string or not, but I don't see why either behavior should imply that blob = blob should be anything but a binary comparison or blob = string anything other than a string comparison. I personally see no reason why LIKE shouldn't apply the pattern to the whole blob, but GLOB and REGEXP must apply only to strings (or BLOBs cast to TEXT). Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] inserts, performance, file lock...
On Thu, Nov 10, 2011 at 3:19 AM, yqpl y...@poczta.onet.pl wrote: i did some test do check if indexes make it slow. instead of inserting to disk database i use :memory: database - i have copied tables only - i assume without indexes and then do inserts - and it works the same. UNIQUE constraints on columns imply indexes. does it prove that it isnt because indexes? I don't understand what you mean by it works the same. Do you mean that it still slows down? Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] inserts, performance, file lock...
On Fri, Nov 11, 2011 at 1:39 AM, yqpl y...@poczta.onet.pl wrote: Nico Williams wrote: What's your page size? i have no access now to those files. but i didnt change any thing - so default. You really want to set the page size to something decent -- at least the filesystem's preferred block size (typically 4KB or 8KB). The page size has a lot to do with performance: a) it affects the number of meta-data nodes in the b-trees as well as the depth of the b-tree, which affects the number of reads needed to do random lookups or insertions, b) page sizes that are not whole multiples of the filesystem's preferred block size result in larger I/Os than the page size anyways because the filesystem will likely want to read the whole thing in. Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] GROUP BY With ASC/DESC
Message: 23 Date: Fri, 11 Nov 2011 10:57:22 -0500 From: da...@dandymadeproductions.com To: sqlite-users@sqlite.org Subject: [sqlite] GROUP BY With ASC/DESC Message-ID: 5ed601b698a020a8d790240cc05c8714.squir...@dandymadeproductions.com Content-Type: text/plain;charset=iso-8859-1 Given the following table: DROP TABLE IF EXISTS key_table5; CREATE TABLE key_table5 ( name TEXT NOT NULL default '', color TEXT default NULL, price REAL default NULL, UNIQUE (name), PRIMARY KEY (color) ); I'm getting the following exception: SQLException: [SQLITE_ERROR] SQL error or missing database (near ASC: syntax error) with: SELECT name, color, Sum(price) AS price FROM key_table5 WHERE price '2' GROUP BY color ASC ORDER BY name ASC LIMIT 50 OFFSET 0 when either ASC or DESC is used with the GROUP BY clause. Is this a bug or a query statement inaccuracy on my part? SQLite version 3.7.2 3.7.9. danap. -- Message: 24 Date: Fri, 11 Nov 2011 16:10:04 + From: Black, Michael (IS) michael.bla...@ngc.com To: General Discussion of SQLite Database sqlite-users@sqlite.org Subject: Re: [sqlite] GROUP BY With ASC/DESC Message-ID: ABF72471-0160-4FBE-A249-5CFBC96DD19B@mimectl Content-Type: text/plain; charset=iso-8859-1 I think you want ORDER BY COLOR,NAME Since color is a primary key you can't have dups so what good is the group by? Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems Currently I'm implementing an interface for the MyJSQLView SQL GUI to use the GROUP BY aspect of SQL statements. Upon testing with SQLite I received the above indicated exeception. So the context given is only an example with no predefined objective of meaningful results. Perhaps a more meanful example with the same results: Given: ROP TABLE IF EXISTS General_Expenses; CREATE TABLE General_Expenses ( id INTEGER NOT NULL, date date NOT NULL, company TEXT NOT NULL, record_type TEXT NOT NULL, payment_method TEXT NOT NULL, account TEXT NOT NULL, description TEXT, cost REAL NOT NULL, PRIMARY KEY (id) ); The query with GROUP BY account works fine, but if ASC or DESC included with the field GROUP BY an exception is given. (fine) SELECT date, company, account, Sum(cost) AS cost FROM General_Expenses WHERE date LIKE '2011%' GROUP BY account ORDER BY account ASC (exception) SELECT date, company, account, Sum(cost) AS cost FROM General_Expenses WHERE date LIKE '2011%' GROUP BY account ASC ORDER BY account ASC SQLException: [SQLITE_ERROR] SQL error or missing database (near ASC: syntax error) SQLState: null VendorError: 0 Is this a bug, since the documentation indicates ASC and DESC can be used with GROUP BY statements? danap. Dana M. Proctor MyJSQLView Project Manager http://myjsqlview.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [bug] LIKE operator ignores rest of string after NUL character
On 11 Nov 2011, at 6:09pm, Nico Williams wrote: blob = blob should be a binary comparison blob = string should be a string comparison blob LIKE pattern should either treat the blob as a string or not, but I don't see why either behavior should imply that blob = blob should be anything but a binary comparison or blob = string anything other than a string comparison. I personally see no reason why LIKE shouldn't apply the pattern to the whole blob, but GLOB and REGEXP must apply only to strings (or BLOBs cast to TEXT). You know, actually the only one of those I think should work is the first: you can compare two blobs for identity. Anything which treats a BLOB like a string should yield an error of some kind (null ?). I think of a BLOB as a black box of some kind: you treat it as a whole, and don't look inside it. If you want parts of it you need to know whether to treat it as a string, a float, a colour, etc.. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Quickest way to get an answer
My db has a column called status. This can take one of 7 or so integer values from 0 to 7 or so. There are times when I need a quick answer to this question: are there any rows in the db for which status has value 0. I don't need to know how many, just whether there are any or not. What's the least expensive form of making this query? So far I've tried: select count(status) from mytable where status=0; select count(status) from mytable where status=0 limit 1; select status from mytable where status=0 limit 1; When doing this a number of times I see some seconds of CPU being taken; I haven't yet pinned it down to being an SQLite problem - I'm about to do some timings to see where the time is going. I've added an index: create index stat on mytable (status asc); and using the third form above together with an index seems to improve matters a bit. If all the forms above are roughly equivalent that would be helpful to know. There are 3000 or so rows in the table. -- Cheers -- Tim ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Quickest way to get an answer
On 11 Nov 2011, at 7:27pm, Tim Streater wrote: select count(status) from mytable where status=0; select count(status) from mytable where status=0 limit 1; select status from mytable where status=0 limit 1; When doing this a number of times I see some seconds of CPU being taken; I haven't yet pinned it down to being an SQLite problem - I'm about to do some timings to see where the time is going. I've added an index: create index stat on mytable (status asc); and using the third form above together with an index seems to improve matters a bit. The index is definitely going to improve this tremendously. I like your third form from the above: try to get just one line, and see whether you get one line or none. I think it'll work best. But you might want to compare its speed with select count(*) from mytable where status=0 The 'count(*)' expression gets treated specially. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [bug] LIKE operator ignores rest of string after NUL character
On Fri, Nov 11, 2011 at 1:16 PM, Simon Slavin slav...@bigfraud.org wrote: On 11 Nov 2011, at 6:09pm, Nico Williams wrote: blob = blob should be a binary comparison blob = string should be a string comparison blob LIKE pattern should either treat the blob as a string or not, but I don't see why either behavior should imply that blob = blob should be anything but a binary comparison or blob = string anything other than a string comparison. I personally see no reason why LIKE shouldn't apply the pattern to the whole blob, but GLOB and REGEXP must apply only to strings (or BLOBs cast to TEXT). You know, actually the only one of those I think should work is the first: you can compare two blobs for identity. Anything which treats a BLOB like a string should yield an error of some kind (null ?). I think of a BLOB as a black box of some kind: you treat it as a whole, and don't look inside it. If you want parts of it you need to know whether to treat it as a string, a float, a colour, etc.. I like the ability to cast blobs to text. I do agree that NULs in blobs must be treated as string ending NULs (if nothing else because TEXT in SQLite3 is supposed to be Unicode). But I would agree that an explicit cast should be required, that blobs and text should always compare as not equal unless a cast is involved. Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] inserts, performance, file lock...
yes still slows down. Nico Williams wrote: On Thu, Nov 10, 2011 at 3:19 AM, yqpl y...@poczta.onet.pl wrote: i did some test do check if indexes make it slow. instead of inserting to disk database i use :memory: database - i have copied tables only - i assume without indexes and then do inserts - and it works the same. UNIQUE constraints on columns imply indexes. does it prove that it isnt because indexes? I don't understand what you mean by it works the same. Do you mean that it still slows down? Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- View this message in context: http://old.nabble.com/inserts%2C-performance%2C-file-lock...-tp32814772p32828341.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] inserts, performance, file lock...
On Fri, Nov 11, 2011 at 2:38 PM, yqpl y...@poczta.onet.pl wrote: yes still slows down. Can you characterize it? All index inserts should slow down somewhat as the index grows since lookup and insertion will be O(logN) operations for b-trees, but also as your indexes grow larger than available memory you'll notice dramatic slow downs. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] VACUUM
On Fri, Nov 11, 2011 at 00:52:23 +0100, Fabian wrote: 2011/11/10 Richard Hipp d...@sqlite.org Because when you are inserting the Nth row, SQLite has no idea of how many more rows will follow or how big the subsequent rows will be, so it has no way to reserve contiguous space sufficient to hold them all. The result is that parts of the table and parts of the indices become interleaved in the file. But nothing is written to disk until I call commit (i'm using journal_mode=memory), so when SQLite has to start writing the data, it knows the exact total number of rows, and also that no other rows will follow. That's not how journals work. Or rather, it is the way wal journal works. All the other journal modes, inlcuding memory, work by writing the data directly to the database and storing information needed to return the database to the last consistent state in the journal. But then again, maybe the format of the journal in memory, is an exact copy of the bytes it will write to disk, and in that case I understand that it would be very inefficient to start shuffling things, instead of just dumping it. I pictured it like a temporary table, in which case it would be fairly easy to restructure things before writing. No, the journal does not contain the bytes that are going to be written to disk at all, ever. It contains the bytes that were on the disk before. Than the transaction is committed by simply deleting the journal and rolled back by writing the content of the journal back into the file. The only difference is the write-ahead log, wal, journal mode, that was introduced in version 3.7. In that case the the journal contains something like patches to be applied to the database. Transactions are committed by just marking the data valid in the journal and there is a special checkpoint operation that actually writes the data to the database file itself. It has the advantage that readers are not blocked by write and is often faster for small transactions, but it does not handle huge transactions (at $work we use sqlite to process huge datasets where one transaction often writes table with several million rows; wal is not much use in such context) -- Jan 'Bulb' Hudec b...@ucw.cz ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Segment merging in FTS and updates-deletes
On Fri, Nov 11, 2011 at 9:58 AM, nobre rafael.ro...@novaprolink.com.br wrote: Hi, I'm studying the indexing mechanism of FTS3/4, I can pretty much understand how doclists, terms, segments are created and stored, but one thing I can't grasp is about updating and deleting docs and keeping up the index up to date. From the source comments: [quote] ** Since we're using a segmented structure, with no docid-oriented ** index into the term index, we clearly cannot simply update the term ** index when a document is deleted or updated. For deletions, we ** write an empty doclist (varint(docid) varint(POS_END)), for updates ** we simply write the new doclist. Segment merges overwrite older ** data for a particular docid with newer data, so deletes or updates ** will eventually overtake the earlier data and knock it out. The ** query logic likewise merges doclists so that newer data knocks out ** older data. [/quote] Its clear to me that with the way things are stored, it would be crazy to update all doclists with matches related to a single docid. I just don't see how a segment merge can possibly know which doclist is older/newer, other than by the level. What happens when a document stored in a level 0 segment (recently inserted) is updated or deleted? Which one will be kept and go up to a level 1 segment? The lower-level segments are always newer, and the lower-numbered segments within a level are always older. So if you have a transaction which adds a document (causing a level-0 segment), then another which deletes that document (another level-0 segment), when the level-0 segments are merged into a level-1 segment the references to the original insert will be overwritten by the references to the delete. If a query happens before that merge, all of the doclists for the query term will merge together and the same thing will happen. Unfortunately, due to how things work those deletes have to stick around, even after they catch to originals. The problem is how updates are implemented. If you did an insert, an update, and a delete on a document, then the update can catch the insert and overwrite it, then the delete can catch the update, and all is fine. But if the delete catches the update first, the delete has to be retained because there could be an even earlier record in the system. One could conceive of a very similar system where deletes destruct with the original inserts, but it would require slight (format-incompatible) tweaks to the storage format. Or one could track documents in the system more explicitly and purge deletions that way. [Apologies if my fuzzy memory made this paragraph a fuzzy explanation.] -scott ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] HAVING and aggregate functions
Hi, I have a strange problem with an SQL query. I want to filter records by the result of a grouped column. SQLite doesn't complain but gives no result records in PHP PDO but the very same SQL does work fine in the SQLite shell on Windows. I have version 3.6.22 and an older 3.6 on the PHP side which does not work, and SQLite 3.7.4 on the shell side where it works. A simplified example of my query is like this: select name, count(*) cnt from things group by name having cnt 5 order by cnt desc What's the problem here? I wanted to try it with an older SQLite shell but I couldn't find any historic downloads on the SQLite website. -- Yves Goergen LonelyPixel nospam.l...@unclassified.de Visit my web laboratory at http://beta.unclassified.de ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] General question on sqlite3_prepare, the bind and resets of prepared statements
Embedded Sqlite3 questions: I want to load and prepare multiple statements, keep them prepared and when I want to use one of them, I will reset, bind and step. Can pre-prepare multiple independent statements, then run them one at a time at random? Thanks, this may be a newbie question for embeded sqlite3 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] General question on sqlite3_prepare, the bind and resets of prepared statements
Yes, works great! On 11/11/2011 4:24 PM, Matt Young wrote: Embedded Sqlite3 questions: I want to load and prepare multiple statements, keep them prepared and when I want to use one of them, I will reset, bind and step. Can pre-prepare multiple independent statements, then run them one at a time at random? Thanks, this may be a newbie question for embeded sqlite3 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] General question on sqlite3_prepare, the bind and resets of prepared statements
On 11/11/2011 7:24 PM, Matt Young wrote: Embedded Sqlite3 questions: I want to load and prepare multiple statements, keep them prepared and when I want to use one of them, I will reset, bind and step. Make it bind, step and reset. Don't leave a statement active for a long time - it keeps the transaction open. Can pre-prepare multiple independent statements, then run them one at a time at random? Yes. -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Quickest way to get an answer
select count(status) from mytable where status=0; select count(status) from mytable where status=0 limit 1; These two are identical because query always return one row thus making limit 1 a no-op. select status from mytable where status=0 limit 1; This one will have the best performance because SQLite won't have to find all rows matching the WHERE condition, it will be able to stop on the first row found. Pavel On Fri, Nov 11, 2011 at 2:27 PM, Tim Streater t...@clothears.org.uk wrote: My db has a column called status. This can take one of 7 or so integer values from 0 to 7 or so. There are times when I need a quick answer to this question: are there any rows in the db for which status has value 0. I don't need to know how many, just whether there are any or not. What's the least expensive form of making this query? So far I've tried: select count(status) from mytable where status=0; select count(status) from mytable where status=0 limit 1; select status from mytable where status=0 limit 1; When doing this a number of times I see some seconds of CPU being taken; I haven't yet pinned it down to being an SQLite problem - I'm about to do some timings to see where the time is going. I've added an index: create index stat on mytable (status asc); and using the third form above together with an index seems to improve matters a bit. If all the forms above are roughly equivalent that would be helpful to know. There are 3000 or so rows in the table. -- Cheers -- Tim ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] JOIN vs IN
On Fri, Nov 11, 2011 at 11:58 AM, Petite Abeille petite.abei...@gmail.com wrote: It returns the same results, but it doesn't seem much faster. Is there any performance difference to be expected from using IN instead of JOIN, or does SQLite internally rewrite JOIN queries to something similar as IN, which would explain they perform nearly the same? They should be equivalent in terms of cost. That said, you might want to use the 'exists' clause instead for the sake of clarity. No, exists in this case will change query plan significantly and performance can degrade drastically as a result. For the original question: it's not that SQLite rewrites JOIN queries to be as IN. It's just in your particular case both queries can be executed in the same way: find all rows in table1 with the necessary amount, for each row look into table2 and find rows with the same rowid. Query with JOIN however could be executed differently - for each row in table2 find all rows with the same rowid in table1 and then check amount in them. SQLite decided that this query plan will be less effecient. Pavel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] JOIN vs IN
On Nov 12, 2011, at 1:58 AM, Pavel Ivanov wrote: No, exists in this case will change query plan significantly and performance can degrade drastically as a result. Why would that be? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] JOIN vs IN
Fabian wrote: Suppose I have two tables, and I want to have look for a value in the first table, but display the columns from the second table. The most obvious way would be joining them on rowid. But I don't need to SELECT any columns from the first table, and it's a FTS4 table (which always joins a bit slower than real tables), so I currently do: SELECT * FROM table2 WHERE rowid IN (SELECT rowid FROM table1 WHERE amount 500) It returns the same results, but it doesn't seem much faster. Is there any performance difference to be expected from using IN instead of JOIN, or does SQLite internally rewrite JOIN queries to something similar as IN, which would explain they perform nearly the same? The logical operation you are doing is a semijoin, filtering table2 by matching rows in table1 (if you used NOT IN instead you would be doing an antijoin). A semijoin is most clearly expressed in SQL using the WHERE clause as you did, because the only purpose of table1 is to filter and not to return values from, as putting it in FROM would imply. Now because SQL is bag oriented rather than set oriented, using IN also helps because you avoid generating extra duplicates, whereas if you used the join method instead, then if any row in one table matched multiple rows in the other (because you weren't joining on a (unique) key of both tables), the result could have duplicate table2 rows, which probably isn't what you want. As to your performance question, any good DBMS should make both of your methods perform about the same, but that if they aren't the same, the IN version should always perform faster than the FROM version because with IN you only ever have to look at each row in table2 once; as soon as it finds any match you move on, rather than repeating for all possible matches. Note that semijoins and antijoins are what you have both when you have another select after the NOT/IN and when you have a literal list, such as IN (1,2,3). Note that any WHERE clause that consists just of ANDed equality tests, such as the common WHERE foo = 3 is also a trivial case of a semijoin where the table you are filtering on has exactly 1 row whose field value is 3, and ostensibly such WHERE clauses should also be optimizable. -- Darren Duncan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Referencing subquery several times
Hi everybody, Here is a simplified version of the statement I try to run (let a, b, and m be tables with only one column named c containing integers): SELECT * FROM m WHERE c IN (SELECT * FROM a) OR c IN (SELECT * FROM b) AND (NOT c IN (SELECT * FROM a) OR c IN (SELECT * FROM b)); I know this could be done more easily, but I have to keep this structure (i.e. cannot JOIN for instance). My question is, how could I avoid repeating the subqueries after the AND NOT, since they will return the same set as the previous ones? In case I do not replace them, would SQLite be able to optimize and only run them once? Thanks, Alex. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] JOIN vs IN
On Fri, Nov 11, 2011 at 8:06 PM, Petite Abeille petite.abei...@gmail.com wrote: No, exists in this case will change query plan significantly and performance can degrade drastically as a result. Why would that be? How would you rewrite the query using exists? The only thing I have in mind is SELECT * FROM table2 WHERE exists ( SELECT 1 FROM table1 WHERE amount 500 AND table1.rowid = table2.rowid) And this query will force SQLite to use the second query plan I talked about: scan full table2 and for each row search in table1 for rows with the same rowid and check if it has necessary amount. And this plan will very likely be slower. Pavel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Referencing subquery several times
I know this could be done more easily, but I have to keep this structure (i.e. cannot JOIN for instance). My question is, how could I avoid repeating the subqueries after the AND NOT, since they will return the same set as the previous ones? With your restriction on query structure you cannot do that. In case I do not replace them, would SQLite be able to optimize and only run them once? No, SQLite doesn't have this sort of optimization AFAIK. Pavel On Fri, Nov 11, 2011 at 9:24 PM, Alexandre Courbot gnu...@gmail.com wrote: Hi everybody, Here is a simplified version of the statement I try to run (let a, b, and m be tables with only one column named c containing integers): SELECT * FROM m WHERE c IN (SELECT * FROM a) OR c IN (SELECT * FROM b) AND (NOT c IN (SELECT * FROM a) OR c IN (SELECT * FROM b)); I know this could be done more easily, but I have to keep this structure (i.e. cannot JOIN for instance). My question is, how could I avoid repeating the subqueries after the AND NOT, since they will return the same set as the previous ones? In case I do not replace them, would SQLite be able to optimize and only run them once? Thanks, Alex. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Referencing subquery several times
On 11/11/2011 9:24 PM, Alexandre Courbot wrote: Here is a simplified version of the statement I try to run (let a, b, and m be tables with only one column named c containing integers): SELECT * FROM m WHERE c IN (SELECT * FROM a) OR c IN (SELECT * FROM b) AND (NOT c IN (SELECT * FROM a) OR c IN (SELECT * FROM b)); This query doesn't make much sense. It appears that quite a few conditions are redundant, or else the parentheses are in the wrong places. What logic were you trying to express here? -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Any thoughts on optimizations for this query?
Hi all, We are struggling to find a way to rework this query in a way that performs efficiently for large data sets. The goal is to find the top x most active senders of e-mails within a date range. The killer of this query is the ORDER BY clause. Without it the results are quick and snappy. Obviously to get the LIMIT to provide useful results we have to use the ORDER. It seems so close to being able to work efficiently, if only it could be performing the LIMIT/sort while it's collating the groups. The magnitude of data we are working with is in the millions for both message and sender records and this can take hours to complete the query. Any ideas would be greatly appreciated. Thanks, Josh SELECT EMailAddress, COUNT(*) AS Total FROM senders INNER JOIN messages ON messages.message_ID = senders.message_ID INNER JOIN email_addresses ON senders.email_address_ID = email_addresses.email_address_ID WHERE CAST (messages.TimeStamp AS INTEGER) BETWEEN 0 AND 99 GROUP BY senders.email_address_ID ORDER BY Total DESC LIMIT 50 Table create statements: CREATE TABLE messages (message_ID INTEGER PRIMARY KEY AUTOINCREMENT, TimeStamp INTEGER); CREATE INDEX messages_timestamp_index ON messages (TimeStamp); CREATE TABLE email_addresses (email_address_ID INTEGER PRIMARY KEY AUTOINCREMENT, EMailAddress TEXT UNIQUE); CREATE TABLE senders (message_ID INTEGER, email_address_ID INTEGER DEFAULT NULL, FOREIGN KEY(message_ID) REFERENCES messages(message_ID) ON DELETE CASCADE); CREATE INDEX senders_emailAddressID_index ON senders (email_address_ID); CREATE UNIQUE INDEX senders_constraint_index ON senders (message_ID, email_address_ID); ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Any thoughts on optimizations for this query?
On 12 Nov 2011, at 3:43am, Josh Gibbs wrote: We are struggling to find a way to rework this query in a way that performs efficiently for large data sets. It's all about the indexes. The goal is to find the top x most active senders of e-mails within a date range. Do you frequently do this for many of random date ranges ? Or do you have a set of standard requirements, like 'Most popular posters today, this week, this month.' ? WHERE CAST (messages.TimeStamp AS INTEGER) BETWEEN 0 AND 99 It's a little late at night, so forgive me. What is this for ? What formats are the TimeStamps without CASTing ? Can you store the TimeStamps as integers, or even get rid of this entirely ? Oh hold on ... this is just for testing and in real life you set it to a smaller span ? ORDER BY Total DESC This is what you say is killing you. Just noting it. CREATE INDEX messages_timestamp_index ON messages (TimeStamp); Because you do a CAST in your 'WHERE' clause, I don't know whether this index would actually be used. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users