Re: [sqlite] How to join 3 tables?
On Tue, 2009-10-06 at 21:14 -0700, Darren Duncan wrote:
> Kermit Mei wrote:
> > How can I do it in only one SQL command? I wrote it like this, but it
> > can't run:
> > sqlite> SELECT ZGroupItem.ZPhDevId HomeDev.text FROM ZGroupItem
> >    ...> INNER JOIN ZPhDev ON ZGroupItem.ZPhDevId = ZPhDev.id
> >    ...> WHERE ZGroupItem.groupId = 1;
> > SQL error: near ".": syntax error
> >
> > How to solve it?
>
> Well, for starters you need to add a comma between the 2 items you have
> between SELECT and FROM. Simple syntax error.
>
> SELECT ZGroupItem.ZPhDevId, HomeDev.text
> FROM ZGroupItem
>   INNER JOIN ZPhDev ON ZGroupItem.ZPhDevId = ZPhDev.id
> WHERE ZGroupItem.groupId = 1;
>
> -- Darren Duncan

Thanks, Darren Duncan, but that simple syntax error was only a typo in my mail. The real problem, which you did not spot, is that the table HomeDev is not included in the FROM clause at all, even though I need to select HomeDev.text from it. Anyway, thank you all the same. I have now solved this question :)

Best Regards
Kermit Mei

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Problem with PAGE_SIZE and CACHE_SIZE in the sqlite356c-api
Hello,

I understand this issue, but I have another question: the order of the steps I have to perform is not clear to me. At the moment I do this:

1. Create a new db file in the Windows Explorer.
2. Open this file in the console with "sqlite3 db-file".
3. Set the page_size and the cache_size with the PRAGMA statements.
4. Run my C program, which opens the db with
   sqlite3_open_v2(dstdbase, dbdst, SQLITE_OPEN_READWRITE, 0);
   then executes a CREATE statement and an INSERT statement, and then closes the db.

But when I check the page_size and the cache_size after the C program has finished, I get the old values, not the values I entered with the PRAGMA. I think I am making a mistake in the order of the steps. I hope somebody can help me. Thank you in advance.

Thork

> -----Original Message-----
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of D. Richard Hipp
> Sent: Tuesday, 6 October 2009 14:14
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Problem with PAGE_SIZE and CACHE_SIZE in the sqlite356c-api
>
> On Oct 6, 2009, at 8:00 AM, Koston, Thorsten (ICT) wrote:
>
> > Hello,
> >
> > I have a problem with the SQLite 3.5.6 C API.
> > I want to change the page_size and the cache_size.
> >
> > I edited these values in the sqlite3.c file. Then I built SQLite
> > anew and started my application, which uses SQLite.
> > In the application I copy a table of one db to another db (so I open
> > a new db, select from the old db, insert into the new db). Nothing more.
> >
> > But when I read the values for page_size and cache_size of the newly
> > created db in the terminal (using the PRAGMA statements), I get
> > page_size 2048 and cache_size 2000, and it doesn't matter which
> > values I enter in the sqlite3.c file.
> >
> > I think this should be only a small matter, but I have no idea.
> >
> > Is there someone here who can help me?
>
> The page size for a database is fixed when the database is created.
> It is a property of the database file, not a property of the sqlite3
> library. The page size in the sqlite3 source code is merely the
> default page size used when creating new database files.
>
> The default page and cache sizes can be set by command-line options
> when compiling. You should not need to modify the source code. See:
>
> http://www.sqlite.org/compile.html#default_cache_size
> http://www.sqlite.org/compile.html#default_page_size
>
> D. Richard Hipp
> d...@hwaci.com
Re: [sqlite] Problem with PAGE_SIZE and CACHE_SIZE in the sqlite356c-api
> Make a new db-file in the windows explorer.
> Open this file in the console with sqlite3 db-file.
> Then set the page_size and the cache_size with the pragma statements.

You cannot change the page size of a database after it is created(*). When you call sqlite3_open and the file doesn't exist, a zero-length file is created (you can verify this). SQLite doesn't write anything to the file until you create some data. So the steps are really simple:

  sqlite3_open("filename")        -- creates zero-length file
  exec("pragma page_size=xxx")    -- sets page size to user's value
  exec("create table foo(a,b,c)") -- actually writes something to file

(*) You can, however, back it up to a database with a different page size, and I believe you can do something with VACUUM too.

Roger
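Roger's ordering can be reproduced with Python's bundled sqlite3 module instead of the C API (the page size 8192 below is just a sample value): the page_size pragma only takes effect if it is issued before the first write to a brand-new database file.

```python
import sqlite3
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

con = sqlite3.connect(path)                 # creates a zero-length file
con.execute("PRAGMA page_size = 8192")      # must come before anything is written
con.execute("CREATE TABLE foo (a, b, c)")   # first write fixes the page size
con.commit()
con.close()

# Reopen: the page size is now a property of the file itself.
con = sqlite3.connect(path)
ps = con.execute("PRAGMA page_size").fetchone()[0]
con.close()
print(ps)  # 8192
```

Issuing the same pragma after the CREATE TABLE would be silently ignored, which is exactly the symptom described earlier in the thread.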
Re: [sqlite] Problem with PAGE_SIZE and CACHE_SIZE in the sqlite356c-api
Hi,

thank you for your help. Now I can change the page_size and the autovacuum setting. But I can't change the cache size: it doesn't matter what value I enter in the PRAGMA statement passed to sqlite3_exec, the cache_size is always 2000. Does anybody have an idea what I can do?

Thank you in advance!
Re: [sqlite] [Retrieved]Re: DB Corruption
On 10/6/09 15:47, "Simon Slavin" wrote:
>
> On 5 Oct 2009, at 2:59pm, Reusche, Andrew wrote:
>
> There are very many problems associated with synchronising two copies
> of the same database. I'm serious: entire books are written about
> just this one subject.

I am currently facing a similar problem, and I was interested by this. Unfortunately a cursory search of Amazon or Google did not uncover such a book. Could you perhaps suggest a reference?

The problem I am currently tackling is the following:

1- I have one central server (there will always be exactly one) running MySQL v5.x. The use of MySQL is currently a given, but the server might in the future migrate to something else. I have no visibility or control over that, so let's keep MySQL in this context.

2- I have n "core client" applications (currently n=1, but it is likely to grow soon, so assume n>1) using an SQLite database to store their data.

3- I have p "secondary clients" (currently p=0; it may or may not grow) that might modify the data on the central server directly. The foreseen use case is a web application using PHP or some such.

The central server is supposed to be the master reference. All clients are responsible for keeping up to date with the master server. I control the core client application, and I can impose any set of rules on any future secondary client. I also control the data models both on the core clients and on the server, to the extent that I can add tables and columns to the clients' domain data. I may also be able to impose constraints on the clients' domain data if there are good reasons.

So far my solution (currently about 80% done, but I expect the devil to show up in the last few percent) basically does this:

- Add two columns to all domain tables in the client. The first, "central_id", stores a central ID that is set by the server; any record where central_id=0 is thus only local and needs to be sent to the server. The second, "stamp", is a modification stamp maintained by the server. Given two instances of the same record (where central_id is equal), the current one is the one with the higher stamp. Stamp can be implemented with a revision count or a time stamp or whatever.

- Replicate the domain tables to the central server with the same columns. An additional boolean column is added on the server only, to be used as a "deleted" marker.

- Upon insert of a new record locally in one client, my code sends it to the central database, where both the central_id and the stamp are set and returned to the local client, which stores them locally as well. This is conceptually an "on insert" trigger.

- Upon modification of an existing record locally, the modifications are pushed to the central database, where the stamp is "incremented" and returned to the client, where it is modified too. This is conceptually an "on update" trigger.

- Upon deletion of an existing record locally, the record is actually deleted from the client data, but it is only marked as deleted on the server, and its central_id is not made available again. The data stays there but becomes invisible to the normal client behaviour, since all queries in effect have an "AND NOT deleted" condition. This is conceptually an "on delete" trigger.

- Management of central_id and stamp on the server is done purely on the server using MySQL autoincrement and triggers. Their management on the client is done by my code.

- On a regular basis by polling, AND before any action started by the user that might use any existing data (basically everything except a pure creation from scratch), bring the local database up to date with, in essence, a query similar to:

  SELECT * FROM aTable WHERE stamp > last_sync_stamp;

  Of course, last_sync_stamp is maintained locally per table.

That's it, and it's all very naive. I'd appreciate some feedback here, or pointers to literature.

Many thanks,

Jean-Denis

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
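The client-side scheme described above can be sketched in a few lines of Python/SQLite. The table and column names other than central_id and stamp are hypothetical; this only shows the two bookkeeping columns and the two queries the scheme relies on (find local-only records to push, and pull everything newer than the last sync stamp):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE contacts (
    id         INTEGER PRIMARY KEY,
    name       TEXT,
    central_id INTEGER NOT NULL DEFAULT 0,  -- 0: local-only, needs upload
    stamp      INTEGER NOT NULL DEFAULT 0   -- server-maintained revision
)""")
con.executemany(
    "INSERT INTO contacts (name, central_id, stamp) VALUES (?, ?, ?)",
    [("alice", 101, 3), ("bob", 0, 0), ("carol", 102, 7)])

# Records that still have to be sent to the central server:
to_push = con.execute(
    "SELECT name FROM contacts WHERE central_id = 0").fetchall()
print(to_push)  # [('bob',)]

# Pull step: everything the server changed since our last sync.
last_sync_stamp = 3
changed = con.execute(
    "SELECT name FROM contacts WHERE stamp > ?",
    (last_sync_stamp,)).fetchall()
print(changed)  # [('carol',)]
```

The server-side counterpart (autoincrement central_id, stamp triggers, deleted flag) lives in MySQL and is not shown here.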
Re: [sqlite] Problem with PAGE_SIZE and CACHE_SIZE in the sqlite356c-api
Hello,

I found the problem. I have to change the default_cache_size with the PRAGMA, because the cache_size is reset to the default value when the db is closed.

Thank you for your help.
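The distinction the thread arrives at can be demonstrated directly: "PRAGMA cache_size" affects only the current connection, while "PRAGMA default_cache_size" (deprecated, but present in builds that do not define SQLITE_OMIT_DEPRECATED) writes the default into the database file header, so later connections pick it up. A sketch with Python's sqlite3 module, using arbitrary sample values:

```python
import sqlite3
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

con = sqlite3.connect(path)
con.execute("CREATE TABLE t (x)")
con.execute("PRAGMA cache_size = 7777")   # per-connection only
per_conn = con.execute("PRAGMA cache_size").fetchone()[0]
print(per_conn)  # 7777
con.close()

con = sqlite3.connect(path)
# The 7777 did not survive the close; we are back at the build default.
after_reopen = con.execute("PRAGMA cache_size").fetchone()[0]
# default_cache_size is persisted in the file header instead (if the
# library still supports this deprecated pragma).
con.execute("PRAGMA default_cache_size = 5000")
con.close()

con = sqlite3.connect(path)
persisted = con.execute("PRAGMA cache_size").fetchone()[0]
print(persisted)  # typically reflects the stored default on supporting builds
con.close()
```

This matches the resolution above: to make the setting stick across connections, it has to go into the file, not the connection.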
[sqlite] New open source data synchronization tool
Hello,

We are trying to find an open source ETL tool. Basically, we need our software to perform ETL, data migration and data synchronization. The tool will not be used on larger projects. A few open source tools are on the market. Any ideas?

Thanks.
[sqlite] Synchronising multiple copies of a database
On 7 Oct 2009, at 10:13am, Jean-Denis Muys wrote:

> [setup description]
>
> I'd appreciate some feedback here or
> pointers to literature.

You see how complicated you had to get before you had an acceptable solution? Two extra columns? Timestamps? Retention of records which have to be ignored? I see nothing obviously wrong, so here are some general comments.

Try really, really hard just to have all sites access your MySQL database remotely. MySQL is an excellent client/server version of SQL. You could kill all the complexity you had to invent by just having everyone access your database live. Not only is the programming simpler, but you can back everything up in one go, and everyone has completely up-to-date data.

If this solution cannot be implemented for you, try keeping a journal. Keep an unaltered central copy of the data. As each site contacts the central site, play that site's journal back against the unaltered central copy. The post-journal central copy of the database becomes the new copy for distribution.

If these just cannot be done, you have to implement your own solution. Check that it handles these nasty scenarios for synchronisation:

- One site deletes a record, then another site modifies it. E.g. Branch 1 says "Our supplier withdrew the product 'Flinns' and created a new one."; Branch 2 says "Our supplier renamed their 'Flinns' product to 'Flanns', because 'Flinns' is a swearword in Hijinks." Each site then continues selling the new product. Headquarters considers these the same product, and wants the sales figures added up, not treated separately.

- Superseding UPDATEs. Starting with three cars (black, red and green), one site resprays all black cars red and another site resprays all red cars green. After synchronising, depending on which update you get first, you may or may not end up with a red car.

- Competing UPDATE and INSERT. One site creates a new product for category L. Then Headquarters sends out a memo that category L is now category Q, but it doesn't know about the new product yet. The site's copy of the database gets accurate data, but since Headquarters got the UPDATE first, then the INSERT, it somehow still has a category L product.

If your system deals with those, it's most of the way there.

Simon.
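The "superseding UPDATEs" hazard above is easy to reproduce: replaying the same two updates in a different order leaves the database in a different final state. A small sketch with Python's sqlite3 module (table and site names are illustrative):

```python
import sqlite3

def respray(order):
    """Apply the two sites' updates in the given order; return final colours."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE cars (colour TEXT)")
    con.executemany("INSERT INTO cars VALUES (?)",
                    [("black",), ("red",), ("green",)])
    updates = {
        "site_a": "UPDATE cars SET colour = 'red'   WHERE colour = 'black'",
        "site_b": "UPDATE cars SET colour = 'green' WHERE colour = 'red'",
    }
    for site in order:
        con.execute(updates[site])
    return sorted(c for (c,) in con.execute("SELECT colour FROM cars"))

print(respray(["site_a", "site_b"]))  # ['green', 'green', 'green']
print(respray(["site_b", "site_a"]))  # ['green', 'green', 'red']
```

Same updates, different replay order, different result: a synchroniser that just replays journals has to define which ordering is authoritative.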
Re: [sqlite] Problem with PAGE_SIZE and CACHE_SIZE in the sqlite356c-api
On 7 Oct 2009, at 8:47am, Koston, Thorsten (ICT) wrote:

> Make a new db-file in the windows explorer.
> Open this file in the console with sqlite3 db-file.

If you do not do step 1, sqlite will do it for you.

Simon.
Re: [sqlite] [Retrieved]Re: DB Corruption
When I run "PRAGMA synchronous" I receive a 2. The documentation says this is the value for FULL synchronous mode. I've checked this using the routine through our application, and also using sqlite3.exe. This is on a Win32 system, if it matters.

We experienced another "corruption" issue with the DB; details below. When we tried to write to this DB, we received the "malformed image" error (sorry, I don't have the exact text).

We have now had 4 problems with the database in 1000 installs since early July. The installs each reboot 2-3 times per day, and run every single day. Is this the typical failure rate you would expect to see? We have been using a similar installation, also with SQLite, on 200 installations elsewhere for the past 12 months, and have not had any reports of this type of problem.

I reviewed the primary-key syncing code and it looks OK, but I am still suspicious, as that was the code change that seemed to immediately precede these problems.

The integrity_check output (truncated) was:

*** in database main ***
On tree page 2801 cell 5: 2nd reference to page 2765
On tree page 2801 cell 5: Child page depth differs
On tree page 2801 cell 6: Child page depth differs
On tree page 2685 cell 65: 2nd reference to page 2766
On tree page 2685 cell 65: Child page depth differs
On tree page 2685 cell 66: Child page depth differs
Page 167 is never used
Page 186 is never used
Page 232 is never used
Page 332 is never used
Page 335 is never used
Page 337 is never used
Page 347 is never used
Page 364 is never used
Page 379 is never used
Page 383 is never used
Page 398 is never used
Page 399 is never used
Page 407 is never used
Page 412 is never used
Page 417 is never used
Page 439 is never used
Page 458 is never used
Page 469 is never used
Page 483 is never used
Page 500 is never used
Page 502 is never used
Page 505 is never used
Page 508 is never used
Page 510 is never used
Page 512 is never used
Page 535 is never used
Page 547 is never used
Page 548 is never used
Page 562 is never used
Page 608 is nev

Andrew Reusche
Software Engineer
678-297-5236

-----Original Message-----
From: D. Richard Hipp [mailto:d...@hwaci.com]
Sent: Tuesday, October 06, 2009 7:32 AM
To: Reusche, Andrew
Cc: General Discussion of SQLite Database
Subject: Re: [Retrieved]Re: [sqlite] DB Corruption

On Oct 5, 2009, at 9:59 AM, Reusche, Andrew wrote:

> Thanks. We are beyond the point of recovering data at this point; we
> will just start over with a fresh db. Two questions pertaining to
> what may have caused this:
>
> 1) We are trying to programmatically update the primary keys on this
> table, in order to sync it up with a remote database. In the event
> that this needs to occur, we typically need to take a chunk of
> records in this table and increment all the "sales_id"s (integer,
> primary key) to a higher number. Do you anticipate that this could
> cause a problem?

If the "from" range and the "to" range of the primary key overlap, you may bump uniqueness constraints. If you are asking whether this can cause database corruption, the answer is "no". It is not possible to cause database corruption using SQL (valid or invalid) as far as we know, and this is very carefully tested. Please see http://www.sqlite.org/atomiccommit.html and http://www.sqlite.org/testing.html for further information.

> 2) In the routine mentioned above in 1), and in a number of various
> inserts and updates, I found out that we were not calling "begin" or
> "commit". From the documentation on
> http://www.sqlite.org/lang_transaction.html it looks like a
> transaction is automatically started, so this should not be a
> problem. Do you recommend that I call begin and commit anyway, or do
> you think that will not provide any additional benefit?

BEGIN...COMMIT will mak
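The uniqueness hazard Dr. Hipp mentions is easy to hit: shifting a block of primary keys into a range that overlaps their current range can trip the UNIQUE constraint row by row, even when the final state would be valid. A sketch (the sales_id column name comes from the thread; the data and offsets are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (sales_id INTEGER PRIMARY KEY, amt REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [(1, 9.5), (2, 3.0), (3, 7.25)])

overlap_failed = False
try:
    # Rows are visited in key order, so 1 -> 2 collides with the
    # still-present sales_id = 2. The failed statement is rolled back.
    con.execute("UPDATE sales SET sales_id = sales_id + 1")
except sqlite3.IntegrityError as e:
    overlap_failed = True
    print("overlapping shift failed:", e)

# Shifting into a range that does not overlap works in one statement.
con.execute("UPDATE sales SET sales_id = sales_id + 1000")
shifted = con.execute(
    "SELECT sales_id FROM sales ORDER BY sales_id").fetchall()
print(shifted)  # [(1001,), (1002,), (1003,)]
```

So the constraint error is an inconvenience to code around, not a corruption risk, which is exactly the distinction drawn above.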
[sqlite] BUG: Async IO module works incorrectly with large database files
Hi!

As anonymous users are not allowed to file bug reports on the SQLite site anymore, I'm sending it here.

I've encountered a segmentation fault at sqlite3async.c:715. The problem is on sqlite3async.c:712:

nCopy = (int)MIN(pWrite->nByte-iBeginIn, iAmt-iBeginOut);

My actual numbers: iAmt = 16, iBeginOut = 2147844072. After the subtraction we're getting -2147844056, or FFFFFFFF7FFA8028 in hex, which after truncating to int gives 7FFA8028, a positive number. And that blows away the next check, if( nCopy>0 ).

Pavel
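The overflow in the report can be reproduced arithmetically: the 64-bit difference is negative, but truncating it to a signed 32-bit int (the C cast) flips it positive, which is why the "if( nCopy>0 )" guard fails. A sketch:

```python
def to_int32(x):
    """Interpret the low 32 bits of x as a signed 32-bit int (C-style cast)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

iAmt, iBeginOut = 16, 2147844072           # values from the report
diff = iAmt - iBeginOut                     # -2147844056, clearly negative
print(hex(diff & 0xFFFFFFFFFFFFFFFF))       # 0xffffffff7ffa8028 (64-bit view)
print(to_int32(diff))                       # 2147123240: positive after the cast
```

The low word 0x7FFA8028 matches the hex value in the report, so nCopy comes out large and positive and the subsequent copy runs past the buffer.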
Re: [sqlite] Synchronising multiple copies of a database
On 10/7/09 11:50, "Simon Slavin" wrote:
>
> On 7 Oct 2009, at 10:13am, Jean-Denis Muys wrote:
>
>> [setup description]
>>
>> I'd appreciate some feedback here or
>> pointers to literature.
>
> Try really really hard just to have all sites access your MySQL
> database remotely.

Unfortunately this approach is not possible in the short term. The client applications are legacy applications, and porting them to that scheme is a major undertaking. I need to insert my code within the legacy applications in a way that is as little invasive as possible. This trigger/polling approach was deemed a fair compromise. Over time, new apps will probably be developed that work directly against the central database, and the legacy apps will be phased out progressively.

Also, those legacy apps work when offline, which is a strong incentive to keep them (and I know this opens a new kind of can of worms). Finally, the central server is one or two stellar systems away, with bad latency and throughput. The local data repository in a way acts as a cache to keep the apps responsive.

> Keep a journal. Keep an unaltered central copy of the data. As each
> site contacts the central site, play that site's journal back against
> the unaltered central copy. The post-journal central copy of the
> database becomes the new copy for distribution.

Interesting idea, which makes a lot of sense in the "offline" scenario.

> [snip scary scenarios]
>
> If your system deals with those, it's most of the way there.

I'll have to handle those cases very carefully. Clearly, when two clients modify the global state in an incompatible way, some kind of conflict resolution must happen (similar to what version control systems do). The proviso here is to make sure these conflicts are at least detected. I need to think about all this some more.

Thanks a lot for your very valuable feedback. In any case, any book reference on this topic?

Jean-Denis
Re: [sqlite] Multiple NATURAL JOIN precedence?
I see my problem now. You said it below: "cartesian product". I believe NATURAL JOIN should be used with caution.

I'll demonstrate my mistake with a small sample:

CREATE TABLE t1 (x INTEGER PRIMARY KEY, a INTEGER NOT NULL);
CREATE TABLE t2 (x INTEGER PRIMARY KEY, b INTEGER NOT NULL);
CREATE TABLE t3 (y INTEGER PRIMARY KEY, c INTEGER NOT NULL);
CREATE TABLE t4 (y INTEGER PRIMARY KEY, x INTEGER NOT NULL);

SELECT COUNT(*) FROM t1 NATURAL JOIN t2 NATURAL JOIN t3 NATURAL JOIN t4;
-- wrong... and slow due to cartesian product!

SELECT COUNT(*) FROM t1 NATURAL JOIN t2 NATURAL JOIN t4 NATURAL JOIN t3;
-- correct... and fast!

So SQLite did the right thing here. I have one more question though:

Q: When joining two tables, does SQLite choose the smaller one as the driving table? If I understand things correctly, this yields a performance gain.

Thanks.

> Date: Tue, 6 Oct 2009 20:33:17 -0700
> From: dar...@darrenduncan.net
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Multiple NATURAL JOIN precedence?
>
> Kristoffer Danielsson wrote:
> > All my tables have unique column names, so I tend to use "NATURAL JOIN"
> > for my joins (all join-columns are indexed). When I have multiple tables,
> > I sometimes get incredibly slow queries (not sure if they even terminate).
>
> If all of your tables have unique column names, then a natural join would
> degenerate to a cartesian product, because there are no column pairs across
> different tables for which only rows having matching values are kept. A
> cartesian product would almost always be slow regardless of your JOIN
> syntax. I'll assume that you meant to say that all of your columns *except*
> the ones you are joining on have unique column names, which makes more
> sense. Correct me if I'm wrong.
>
> > SELECT SUM(x) FROM t1 NATURAL JOIN t2 NATURAL JOIN t3 NATURAL JOIN t4
> > GROUP BY x ORDER BY x; -- takes forever, whereas:
> >
> > SELECT SUM(x) FROM t2 NATURAL JOIN t3 NATURAL JOIN t4 NATURAL JOIN t1
> > GROUP BY x ORDER BY x; -- takes one second
>
> Are all of those "x" supposed to be the same column?
>
> I don't think it makes sense to do all of [SUM(x), GROUP BY x, ORDER BY x]
> in the same query.
>
> If the query is supposed to have exactly 1 output row, counting the number
> of groups resulting from the GROUP BY, then the ORDER BY is useless, and
> makes the query slower (unless a smart optimizer eliminates it from ever
> running).
>
> If the query is supposed to have an output row for each distinct value of x
> from the GROUP BY, then SUM(x) would presumably be the same number as x for
> every row.
>
> Did you mean this?:
>
> SELECT foo, SUM(bar) FROM ... GROUP BY foo ORDER BY foo
>
> -- Darren Duncan
Re: [sqlite] Multiple NATURAL JOIN precedence?
> Q: When joining two tables, does SQLite choose the smaller one as the driving
> table? If I understand things correctly, this yields a performance gain.

In fact this is not always true. For example, let's say you're doing

SELECT * FROM t1 INNER JOIN t2 ON t1.x = t2.x ORDER BY t2.y

If t1 and t2 do not have any indexes, it will not matter much which table is the
driving one - SQLite has to make a cartesian join anyway, doing cnt_t1 * cnt_t2
iterations (where cnt_t1 is the number of rows in t1 and cnt_t2 is the number of
rows in t2). If the only index in the system is on t2.y, then making t2 the
driving table is a benefit no matter whether t2 is big or small. If the only
index is on t2.x, then t1 is the better driving table no matter how big it is,
although if t2 is very small the benefit will be negligible. But the biggest
benefit comes when there are two indexes - on t2.y and t1.x - and t2 is the
driving table. Again, it doesn't matter whether t2 is big or small. But if you
run another query:

SELECT * FROM t1 INNER JOIN t2 ON t1.x = t2.x WHERE t1.y = c1 AND t2.z = c2

and you have two indexes, on t1.x and t2.x, then having the smaller table as the
driving table will indeed be better.

All these are pretty simple theoretical examples, but I believe they are all
implemented in the SQLite optimizer.

Pavel

On Wed, Oct 7, 2009 at 9:08 AM, Kristoffer Danielsson wrote:
>
> I see my problem now. You said it below: "cartesian product". I believe
> "NATURAL JOIN" should be used with caution.
>
> I'll demonstrate my mistake with a small sample:
>
> CREATE TABLE t1 (x INTEGER PRIMARY KEY, a INTEGER NOT NULL);
> CREATE TABLE t2 (x INTEGER PRIMARY KEY, b INTEGER NOT NULL);
> CREATE TABLE t3 (y INTEGER PRIMARY KEY, c INTEGER NOT NULL);
> CREATE TABLE t4 (y INTEGER PRIMARY KEY, x INTEGER NOT NULL);
>
> SELECT COUNT(*) FROM t1 NATURAL JOIN t2 NATURAL JOIN t3 NATURAL JOIN t4;
> -- wrong... and slow due to cartesian product!
> SELECT COUNT(*) FROM t1 NATURAL JOIN t2 NATURAL JOIN t4 NATURAL JOIN t3;
> -- correct... and fast!
>
> So SQLite did the right thing here. I have one more question though:
>
> Q: When joining two tables, does SQLite choose the smaller one as the driving
> table? If I understand things correctly, this yields a performance gain.
>
> Thanks.
>
>> Date: Tue, 6 Oct 2009 20:33:17 -0700
>> From: dar...@darrenduncan.net
>> To: sqlite-users@sqlite.org
>> Subject: Re: [sqlite] Multiple NATURAL JOIN precedence?
>>
>> Kristoffer Danielsson wrote:
>> > All my tables have unique column names, so I tend to use "NATURAL JOIN" for
>> > my joins (all join-columns are indexed). When I have multiple tables, I
>> > sometimes get incredibly slow queries (not sure if they even terminate).
>>
>> If all of your tables have unique column names, then a natural join would
>> degenerate to a cartesian product, because there are no column pairs across
>> different tables for which only rows having matching values are kept. A
>> cartesian product would almost always be slow regardless of your JOIN syntax.
>> I'll assume that you meant to say that all of your columns *except* the ones
>> you are joining on, have unique column names, which makes more sense.
>> Correct me if I'm wrong.
>>
>> > SELECT SUM(x) FROM t1 NATURAL JOIN t2 NATURAL JOIN t3 NATURAL JOIN t4
>> > GROUP BY x ORDER BY x; -- takes forever, whereas:
>> >
>> > SELECT SUM(x) FROM t2 NATURAL JOIN t3 NATURAL JOIN t4 NATURAL JOIN t1
>> > GROUP BY x ORDER BY x; -- takes one second
>>
>> Are all of those "x" supposed to be the same column?
>>
>> I don't think it makes sense to do all of [SUM(x), GROUP BY x, ORDER BY x] in
>> the same query.
>>
>> If the query is supposed to have exactly 1 output row, counting the number of
>> groups resulting from the GROUP BY, then the ORDER BY is useless, and makes
>> the query slower (unless a smart optimizer eliminates it from ever running).
>>
>> If the query is supposed to have an output row for each distinct value of x
>> from the GROUP BY, then SUM(x) would presumably be the same number as x for
>> every row.
>>
>> Did you mean this?:
>>
>> SELECT foo, SUM(bar) FROM ... GROUP BY foo ORDER BY foo
>>
>> -- Darren Duncan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
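Kristoffer's four-table schema makes the cartesian product easy to see with a few rows of data. The sketch below (Python's built-in sqlite3 module; the inserted row values are made up purely for illustration) counts the rows of the two three-table join prefixes: when t3 is joined before t4, it shares no column with the intermediate result, so NATURAL JOIN silently becomes a cross product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (x INTEGER PRIMARY KEY, a INTEGER NOT NULL);
    CREATE TABLE t2 (x INTEGER PRIMARY KEY, b INTEGER NOT NULL);
    CREATE TABLE t3 (y INTEGER PRIMARY KEY, c INTEGER NOT NULL);
    CREATE TABLE t4 (y INTEGER PRIMARY KEY, x INTEGER NOT NULL);
    -- hypothetical contents, just to make the row counts visible
    INSERT INTO t1 VALUES (1, 10), (2, 20);
    INSERT INTO t2 VALUES (1, 100), (2, 200);
    INSERT INTO t3 VALUES (1, 5), (2, 6);
    INSERT INTO t4 VALUES (1, 1), (2, 2);
""")

def count(sql):
    return conn.execute(sql).fetchone()[0]

# t3 shares no column with (t1 NATURAL JOIN t2): cartesian product, 2 * 2 rows
cartesian = count("SELECT COUNT(*) FROM t1 NATURAL JOIN t2 NATURAL JOIN t3")

# t4 shares x with (t1 NATURAL JOIN t2): a real join, only matching rows survive
joined = count("SELECT COUNT(*) FROM t1 NATURAL JOIN t2 NATURAL JOIN t4")

print(cartesian, joined)  # the cartesian count is |t1 join t2| * |t3|
```

With larger tables the cartesian intermediate grows multiplicatively, which is why the first query in the thread "takes forever" while the reordered one is fast.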
Re: [sqlite] BUG: Async IO module works incorrectly with large database files
On Oct 7, 2009, at 6:42 PM, Pavel Ivanov wrote: > Hi! > > As anonymous users are not allowed to file bug reports on SQLite site > anymore so I'm sending it here. > > I've encountered segmentation fault at sqlite3async.c:715. The problem > is on sqlite3async.c:712: > > nCopy = (int)MIN(pWrite->nByte-iBeginIn, iAmt-iBeginOut); > > My actual numbers: iAmt = 16, iBeginOut = 2147844072. After > subtraction we're getting -2147844056 or 7FFA8028 in hex which > after truncating to int gives 7FFA8028, a positive number. And it > blows away the next check if( nCopy>0 ). Thanks. Ticket now posted here: http://www.sqlite.org/src/info/94c04eaadb Dan. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
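The arithmetic in Pavel's report can be checked directly. This sketch (Python, using the exact numbers from the report) shows how the negative 64-bit difference, once truncated to a signed 32-bit int as the `(int)` cast does in C, becomes a positive value and slips past the `if( nCopy>0 )` guard:

```python
# Numbers from the bug report, as used at sqlite3async.c:712
iAmt = 16
iBeginOut = 2147844072

diff = iAmt - iBeginOut          # -2147844056, clearly negative
assert diff < 0

# Emulate truncation to a signed 32-bit int, like C's (int) cast:
low32 = diff & 0xFFFFFFFF
nCopy = low32 - 0x100000000 if low32 >= 0x80000000 else low32

# The low 32 bits are 0x7FFA8028 (matching the hex value in the report),
# which reads back as a *positive* signed int, so "if (nCopy > 0)" passes.
print(hex(low32), nCopy)
```

This is the classic 64-bit-offset-minus-large-offset overflow: the check would have to be done in 64-bit arithmetic before narrowing to int.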
Re: [sqlite] Multiple NATURAL JOIN precedence?
Thanks for your clarification, Pavel! > From: paiva...@gmail.com > Date: Wed, 7 Oct 2009 09:28:37 -0400 > To: sqlite-users@sqlite.org > Subject: Re: [sqlite] Multiple NATURAL JOIN precedence? > > > Q: When joining two tables, does SQLite choose the smaller one as the > > driving table? If I understand things correctly, this yields a performance > > gain. > > In fact this is not always true. For example let's say you're doing > > SELECT * FROM t1 INNER JOIN t2 ON t1.x = t2.x ORDER BY t2.y > > If t1 and t2 do not have any indexes then it will not matter much what > table is driving one - SQLite will have to make cartesian join anyway, > doing cnt_t1 * cnt_t2 iterations (if cnt_t1 is number of rows in t1 > and cnt_t2 is number of rows in t2). If the only index in the system > is on t2.y then making t2 driving table will have benefit no matter > big it or small. If the only index is on t2.x then t1 is better to be > driving no matter how big it is, although if t2 is too small benefit > will be negligible. But the biggest benefit will be if there're 2 > indexes - on t2.y and t1.x and if t2 is driving table. And it doesn't > matter again if t2 big or small. > But in case if you do another query: > > SELECT * FROM t1 INNER JOIN t2 ON t1.x = t2.x WHERE t1.y = c1 AND t2.z = c2 > > And you have 2 indexes on t1.x and t2.x then indeed having smaller > table as driving table will be better. > > All these are pretty simple theoretical examples but I believe they > all are implemented in SQLite optimizer. > > > Pavel > > On Wed, Oct 7, 2009 at 9:08 AM, Kristoffer Danielsson > wrote: > > > > I see my problem now. You said it below: "cartesian product". I believe > > "NATURAL JOIN" should be used with caution. 
> > [snip]
[sqlite] BUG: wrong type casting for constants in tcl interface
Hello!

I found some incorrect type casting for constants in all SQLite versions.

= test script =
#!/usr/bin/tclsh8.5
package require sqlite3

sqlite3 db :memory:
db eval {
  create table test(a int);
  insert into test values (1);
}

proc test {label sql} {
  global i
  puts -nonewline $label\t
  puts [db eval $sql]
}

set i 1
test 1.1 {select * from test where a=$i}
test 1.2 {select * from test where 1=$i} ;# it doesn't work
test 1.3 {select a from test where a IN (cast($i AS INT), 160)}
test 1.4 {select a from test where 1 IN (cast($i AS INT), 160)}

set i [db onecolumn {select quote($i)}]
test 2.1 "select * from test where a=$i"
test 2.2 "select * from test where 1=$i" ;# it doesn't work
test 2.3 "select a from test where a IN ($i, 160)"
test 2.4 "select a from test where 1 IN ($i, 160)" ;# it doesn't work

test 3.1 "create view view_test1 as select * from test where a=$i;select * from view_test1"
test 3.2 "create view view_test2 as select * from test where 1=$i;select * from view_test2" ;# it doesn't work
test 3.3 "create view view_test3 as select * from test where a IN ($i);select * from view_test3"
test 3.4 "create view view_test4 as select * from test where 1 IN ($i);select * from view_test4" ;# it doesn't work
==

= result =
1.1	1
1.2
1.3	1
1.4	1
2.1	1
2.2
2.3	1
2.4
3.1	1
3.2
3.3	1
3.4
==

Of course the results of tests 1.1 and 1.2, 2.1 and 2.2, etc. must be equal.
In the sqlite3 shell everything works correctly:

== sqlite3 shell
sqlite> select * from test where 1=1;
1
sqlite> select * from test where 1=quote(1);
1
sqlite> select a from test where a IN (1, 160);
1
sqlite> select a from test where 1 IN (1, 160);
1
sqlite> select a from test where 1 IN (quote(1), 160);
1
sqlite> select a from test where a IN (quote(1), 160);
1
==

P.S. I reported this problem some time ago...

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
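The asymmetry between `a=$i` and `1=$i` follows from SQLite's type-affinity rules: the Tcl interface binds $i as text, a comparison against column a applies a's INTEGER affinity and converts '1' to 1, but a bare literal 1 has no affinity to apply, so 1='1' is simply false. The same effect can be sketched from Python's sqlite3 module (not Tcl, but the binding behaviour is analogous):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a int)")
conn.execute("INSERT INTO test VALUES (1)")

text_param = "1"  # bound as TEXT, like the Tcl variable $i

# Column comparison: a's INTEGER affinity converts '1' to 1 before comparing.
by_column = conn.execute(
    "SELECT * FROM test WHERE a = ?", (text_param,)).fetchall()

# Literal comparison: no affinity involved, so integer 1 != text '1'.
by_literal = conn.execute(
    "SELECT * FROM test WHERE 1 = ?", (text_param,)).fetchall()

print(by_column, by_literal)
```

So the tests numbered x.2 and x.4 "don't work" not because of a cast bug per se, but because there is no column affinity in play to coerce the text-bound constant.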
Re: [sqlite] New open source data synchronization tool
>Hello,
>
>We are trying to find an ETL tool open source. Basically, we need our
>software to perform ETL, data migration and data synchronization.
>
>The program should not be used on larger projects. A few open source tools
>are on the market.
>Some ideas? Thanks.

What is ETL?

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] New open source data synchronization tool
On Wed, Oct 07, 2009 at 10:49:55AM -0500, Beau Wilkinson scratched on the wall: > >Hello, > > > >We are trying to find an ETL tool open source. Basically, we need our > >software to perform ETL, data migration and data synchronization. > > > >The program should not be used on larger projects. A few open source tools > >are on the market. > > >Some ideas? Thanks. > > What is ETL? http://www.google.com/search?q=ETL "Extract, Transform, Load." -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] BUG: Async IO module works incorrectly with large database files
Hi, Dan!

I've found another bug in the async IO module. It happens only occasionally in
my application, but I've found how to reproduce it reliably. You need to:
- not call sqlite3async_run();
- open a new, not yet existing database;
- execute PRAGMA journal_mode = PERSIST;
- execute any 2 statements needing a transaction, e.g. some CREATE TABLE statements.

In this scenario (when the connection uses the async IO module) the last
statement returns the error SQLITE_CANTOPEN.

I've investigated why it happens. In the first transaction, when SQLite opens
the journal file, the async module determines that the opening should happen
asynchronously and puts ASYNC_OPENEXCLUSIVE into the queue. Then at the end of
the transaction the journal file is closed and everything is ok. When SQLite
starts the 2nd transaction it checks (inside hasHotJournal()) for journal
existence, and the async module answers "yes" because of the
ASYNC_OPENEXCLUSIVE in the event queue. Then SQLite tries to open the journal
for reading, and the async module determines that this should be done
immediately. It tries to physically open the journal and fails because the
file does not exist yet. So SQLite decides that there is a hot journal, tries
to open it again for read/write (inside sqlite3PagerSharedLock()) and fails
again because the file doesn't exist yet. And this error then comes back as
the return code from sqlite3_step().

Pavel

On Wed, Oct 7, 2009 at 9:57 AM, Dan Kennedy wrote:
>
> On Oct 7, 2009, at 6:42 PM, Pavel Ivanov wrote:
> [snip]
>
> Thanks.
Ticket now posted here: > > http://www.sqlite.org/src/info/94c04eaadb > > Dan. > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Feature request: extended last_insert_rowid(table_name) function
Hello!

With this function, triggers like this one

CREATE TRIGGER view_user_insert instead of insert on view_user
begin
  insert into user ...
  -- user table _must_ have PK column
  insert into user_attribute (user_id,...) values (last_insert_rowid(),...);
  insert into user_balance (user_id,...) values ((select max(rowid) from user),...);
  ...
end;

could be rewritten as

CREATE TRIGGER view_user_insert instead of insert on view_user
begin
  insert into user ...
  insert into user_attribute (user_id,...) values (last_insert_rowid(user),...);
  insert into user_balance (user_id,...) values (last_insert_rowid(user),...);
  ...
end;

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
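Outside of triggers, application code usually works around the single shared last_insert_rowid() by capturing each new rowid immediately after the corresponding insert. A sketch of that pattern in Python (the user/user_attribute tables here are a simplified, invented version of the schema in the request):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_attribute (user_id INTEGER, attr TEXT);
""")

cur = conn.cursor()
cur.execute("INSERT INTO user (name) VALUES (?)", ("alice",))
user_id = cur.lastrowid  # capture immediately, before any other insert

# Later inserts reuse the captured id instead of calling last_insert_rowid(),
# so intervening inserts into other tables cannot clobber it.
cur.execute("INSERT INTO user_attribute (user_id, attr) VALUES (?, ?)",
            (user_id, "admin"))
conn.commit()

row = conn.execute("SELECT user_id, attr FROM user_attribute").fetchone()
print(row)
```

Inside a trigger body there is no place to stash such a variable, which is exactly why the per-table form last_insert_rowid(table_name) is being requested.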
Re: [sqlite] [Retrieved]Re: DB Corruption
On 6 Oct 2009, at 6:19pm, Reusche, Andrew wrote:

> We experienced another "corruption" issue with the DB. Details below.
> When we tried to write to this DB, we received the "malformed image"
> error. Sorry I don't have the exact text. We have now received 4
> problems with the database in 1000 installs since early July.

You mean you have installations at 1000 sites ? Were the four failures all for
the same site ?

Can you list all the PRAGMAs your software uses -- in other words, everything
you do besides allow SQLite to use default settings for everything.

> The installs each reboot 2-3 times per day, and run every single day.

Can I ask the reason for the frequent reboots ? Are the computers properly
shut down for the reboots, or might they be rebooted when they're in the
middle of something ?

> Is this the typical failure rate you would expect to see?

The typical failure rate for calls to the SQLite library is zero. SQLite is an
integral part of Mac OS X, every iPhone, every copy of Acrobat Reader,
FireFox, McAfee anti-virus software and Skype. If there were a widespread
corruption problem in SQLite databases, someone would probably have asked
about it here.

You may have a hardware problem, or a problem with some other level of your
installation besides SQLite. For instance, hard disk firmware, or the OS, or
faulty RAM. Obviously, it's not possible to prove this.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Synchronising multiple copies of a database
On 7 Oct 2009, at 1:47pm, Jean-Denis Muys wrote: > On 10/7/09 11:50 , "Simon Slavin" wrote: > >> Try really really hard just to have all sites access your MySQL >> database remotely. > > Unfortunately this approach is not possible in the short term. The > client > applications are legacy applications, porting them to that scheme is > a major > undertaking. [snip] I completely understand. The recommendation is valuable in the general case, but useless in yours. Still, that's why they pay you the big bucks: to write the complicated program. >> Keep a journal. Keep an unaltered central copy of the data. As each >> site contacts the central site, play that sites journal back against >> the unaltered central copy. The post-journal central copy of the >> database becomes the new copy for distribution. > > Interesting idea, that makes a lot of sense in the "offline" scenario. Standard solution to the synchronisation problem. The fault is that almost nobody does it right: they neglect to keep an 'unaltered central copy' and think they can cross-apply journals each time two databases talk to one-another. That does not work for various reasons. The synchronisation service built into Mac OS X (e.g. synchronising with online services or an iPhone/iPod) implements it in the correct manner. It takes extra data space and fussy programming but it does at least work right ! > [snip] In any case, any book reference on this topic? Since I joined this list and noticed repeated questions on the subject I have been trying hard to find any book with anything significant to say on the issue. I failed: everything I found was lacking in some way. Some were flat-out wrong. I work at a university and I think I'm going to ask the Computing people to find me someone who knows this stuff. I'm just paid to do it in real life, not read or write books about it. If I find something good I'll read it and post here about it. Simon. 
___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] dump in-memory db to file in tcl
Hello!

1. Use the Tcl backup API.

The "backup" method

The "backup" method makes a backup copy of a live database. The command syntax
is like this:

dbcmd backup ?source-database? backup-filename

The optional source-database argument tells which database in the current
connection should be backed up. The default value is main (or, in other words,
the primary database file). To back up TEMP tables use temp. To back up an
auxiliary database added to the connection using the ATTACH command, use the
name of that database as it was assigned in the ATTACH command. The
backup-filename is the name of a file into which the backup is written.
Backup-filename does not have to exist ahead of time, but if it does, it must
be a well-formed SQLite database.

The "restore" method

The "restore" method copies the content of a separate database file into the
current database connection, overwriting any preexisting content. The command
syntax is like this:

dbcmd restore ?target-database? source-filename

The optional target-database argument tells which database in the current
connection should be overwritten with new content. The default value is main
(or, in other words, the primary database file). To repopulate the TEMP tables
use temp. To overwrite an auxiliary database added to the connection using the
ATTACH command, use the name of that database as it was assigned in the ATTACH
command. The source-filename is the name of an existing well-formed SQLite
database file from which the content is extracted.

2. Use SQL commands to copy the database structure from the sqlite_master
table, then copy the data. 2006-02-20: a simple Tcl implementation for loading
a DB into memory:

proc loadDB {dbhandle filename} {
    if {$filename != ""} {
        # attach persistent DB to target DB
        $dbhandle eval "ATTACH DATABASE '$filename' AS loadfrom"
        # copy each table to the target DB
        foreach {tablename} [$dbhandle eval "SELECT name FROM loadfrom.sqlite_master WHERE type = 'table'"] {
            $dbhandle eval "CREATE TABLE '$tablename' AS SELECT * FROM loadfrom.'$tablename'"
        }
        # create indexes in the loaded table
        foreach {sql_exp} [$dbhandle eval "SELECT sql FROM loadfrom.sqlite_master WHERE type = 'index'"] {
            $dbhandle eval $sql_exp
        }
        # detach the source DB
        $dbhandle eval {DETACH loadfrom}
    }
}

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
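For reference, the Tcl "backup" method wraps the same SQLite online-backup API that other bindings expose, so the dump-to-file step looks much the same elsewhere. A sketch using Python's sqlite3 module (Connection.backup is available from Python 3.7; the table and file names are invented for illustration):

```python
import sqlite3
import os
import tempfile

# Build a small in-memory database.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE t (k TEXT PRIMARY KEY, d TEXT)")
mem.execute("INSERT INTO t VALUES ('a', 'hello')")
mem.commit()

# Dump it to a file with the online-backup API
# (the analogue of the Tcl "db backup dump.db" command).
path = os.path.join(tempfile.mkdtemp(), "dump.db")
disk = sqlite3.connect(path)
mem.backup(disk)
disk.close()

# Re-open the file and verify the copy survived.
copy = sqlite3.connect(path)
restored = copy.execute("SELECT d FROM t WHERE k = 'a'").fetchone()[0]
print(restored)
```

Like the Tcl method, this copies the whole database page-by-page, so no schema-walking over sqlite_master is needed.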
[sqlite] Insert speed greatly decreasing over time
We have an application that uses a single-table database that is quite simple.
Here is the schema:

CREATE TABLE t (k varchar(50) primary key not null, d text not null, e datetime)

We receive data over the course of an hour that is in a different form and we
need to put it into the db. The order of the data is random and it is bursty.
The set of keys is stable in general. Throughout the course of an hour, all
rows are eventually replaced.

Currently, we use 3.6.16. We batch about 200 or so updates into a transaction.
We use INSERT OR REPLACE to modify the data. The db grows to about 450mb; page
size is 4096.

When we start with an empty db, the average insert time is around
0.3s/transaction. When the db is full, it rises to a high of
0.5s/transaction... for a while. Then, magically, after a few hours, that time
jumps to 10s a transaction. We converted things to csvs and so forth to test
through the sqlite command line client, and get the same results. All db
access becomes very slow from any tool.

Using the analyzer, we see that when the db is working well, the index
fragmentation is around 5% or less... then all of a sudden, we get 95-100%
fragmentation of the index. When that happens, we get the horrible insert
times, irrespective, it seems, of table fragmentation. The db can be full and
have tens of thousands of updates before this jump occurs. If I vacuum the db,
of course everything works smoothly, and it seems to take a much longer time
for the fragmentation to occur.

Currently, I am testing a model where we UPDATE (then INSERT if rows affected
= 0) to see if this reduces the horrible index fragmentation we see. This
method should cause less index stress since we don't delete then insert, nor
modify keys in general.

The question I have is: is this normal? How do I prevent this massive random
fragmentation? Vacuum is not a solution - the rate at which we'd have to
vacuum seems unreasonable and we can't just delete the dbs and start over.
The data needs to be present and speedily accessible in general; vacuum causes
problems in this respect. I have seen another thread on this from a few months
ago where someone saw this behavior, then it seemed to go away.

During the slowness of the db, one thing we do notice is that zero cpu is
being used and almost no i/o is going on. Disk TPS is also very, very low.
Running the sqlite3 command line client by itself against one of these highly
fragmented dbs shows the same behavior. For instance, a pragma integrity_check
takes minutes to run: no i/o or cpu at all for a minute or two, then a big
burst. Same with inserting through the sqlite3 client -- when fragmentation of
the index is low, it inserts very fast. When it's high, it inserts slowly, and
we see trivial disk i/o / tps and no cpu. And yes, as stated earlier, vacuum
fixes it... for a short while. We see this behavior across multiple machines
with different motherboards, cpus, etc.

How can our pk index fragment so badly so quickly? Will the update method work
better for this than INSERT OR REPLACE? Any help/thoughts on this would be
appreciated.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
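The UPDATE-then-INSERT variant being tested above can be sketched like this (Python; the table matches the schema from the post, and the upsert helper is an invented illustration - cursor.rowcount after the UPDATE tells us whether an existing key was touched):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k varchar(50) PRIMARY KEY NOT NULL, "
             "d text NOT NULL, e datetime)")

def upsert(cur, k, d, e):
    # Try UPDATE first; only INSERT when no existing row was touched.
    # Unlike INSERT OR REPLACE, an UPDATE of a non-key column does not
    # delete and re-insert the row, so the primary-key index is untouched.
    cur.execute("UPDATE t SET d = ?, e = ? WHERE k = ?", (d, e, k))
    if cur.rowcount == 0:
        cur.execute("INSERT INTO t (k, d, e) VALUES (?, ?, ?)", (k, d, e))

cur = conn.cursor()
upsert(cur, "key1", "first", "2009-10-07 10:00:00")
upsert(cur, "key1", "second", "2009-10-07 11:00:00")  # updates in place
conn.commit()  # batch many upserts per transaction, as in the post

rows = conn.execute("SELECT k, d FROM t").fetchall()
print(rows)
```

Whether this actually avoids the index-fragmentation cliff is exactly the open question in the post; the sketch only shows the mechanics of the pattern.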
Re: [sqlite] Insert speed greatly decreasing over time
Hello! Try this: pragma cache_size=20; Best regards, Alexey Pechnikov. http://pechnikov.tel/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Synchronising multiple copies of a database
regarding this " The fault is that almost nobody does it right: they neglect to keep an 'unaltered central copy' and think they can cross-apply journals each time two databases talk to one-another. That does not work for various reasons." Would a central repository of journals that can be applied to local repositories be sufficient? I suppose I assume that running the same program on N workstations with the same set of journals should produce N identical results. On Wed, Oct 7, 2009 at 12:16 PM, Simon Slavin wrote: > > On 7 Oct 2009, at 1:47pm, Jean-Denis Muys wrote: > > > On 10/7/09 11:50 , "Simon Slavin" wrote: > > > >> Try really really hard just to have all sites access your MySQL > >> database remotely. > > > > Unfortunately this approach is not possible in the short term. The > > client > > applications are legacy applications, porting them to that scheme is > > a major > > undertaking. [snip] > > I completely understand. The recommendation is valuable in the > general case, but useless in yours. Still, that's why they pay you > the big bucks: to write the complicated program. > > >> Keep a journal. Keep an unaltered central copy of the data. As each > >> site contacts the central site, play that sites journal back against > >> the unaltered central copy. The post-journal central copy of the > >> database becomes the new copy for distribution. > > > > Interesting idea, that makes a lot of sense in the "offline" scenario. > > Standard solution to the synchronisation problem. The fault is that > almost nobody does it right: they neglect to keep an 'unaltered > central copy' and think they can cross-apply journals each time two > databases talk to one-another. That does not work for various reasons. > > The synchronisation service built into Mac OS X (e.g. synchronising > with online services or an iPhone/iPod) implements it in the correct > manner. It takes extra data space and fussy programming but it does > at least work right ! 
> > > [snip] In any case, any book reference on this topic? > > Since I joined this list and noticed repeated questions on the subject > I have been trying hard to find any book with anything significant to > say on the issue. I failed: everything I found was lacking in some > way. Some were flat-out wrong. I work at a university and I think > I'm going to ask the Computing people to find me someone who knows > this stuff. I'm just paid to do it in real life, not read or write > books about it. If I find something good I'll read it and post here > about it. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- VerifEye Technologies Inc. 905-948-0015x245 7100 Warden Ave, Unit 3 Markham ON, L3R 8B5 Canada ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Insert speed greatly decreasing over time
Would dropping and re-creating an index help? On Wed, Oct 7, 2009 at 2:19 PM, Alexey Pechnikov wrote: > Hello! > > Try this: > pragma cache_size=20; > > Best regards, Alexey Pechnikov. > http://pechnikov.tel/ > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- VerifEye Technologies Inc. 905-948-0015x245 7100 Warden Ave, Unit 3 Markham ON, L3R 8B5 Canada ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Insert speed greatly decreasing over time
> Try this:
> pragma cache_size=20;

I believe changing pragma cache_size will not help, because the size of the
database doesn't change, so the cache hit ratio doesn't change either. And
there's no disk i/o, suggesting that it's not the bottleneck.

Chris, did you try to attach to the process with some debugger to see what it
is doing (at least which function it's in) when there's no cpu and i/o?

Pavel

On Wed, Oct 7, 2009 at 2:19 PM, Alexey Pechnikov wrote:
> [snip]
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Insert speed greatly decreasing over time
I have not done that yet; I'll get it into that state then attach to sqlite3's command line tool and see if I can learn something. I will let you know what I find. -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov Sent: Wednesday, October 07, 2009 2:23 PM To: pechni...@sandy.ru; General Discussion of SQLite Database Subject: Re: [sqlite] Insert speed greatly decreasing over time > Try this: > pragma cache_size=20; I believe changing pragma cache_size will not help, because size of the database doesn't change, so cache hit ratio doesn't change too. And there's no disk i/o suggesting that it's not the bottleneck. Chris, did you try to attach to the process with some debugger to see what it is doing (at least in what function it's located) when there's no cpu and i/o? Pavel On Wed, Oct 7, 2009 at 2:19 PM, Alexey Pechnikov wrote: > Hello! > > Try this: > pragma cache_size=20; > > Best regards, Alexey Pechnikov. > http://pechnikov.tel/ > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Synchronising multiple copies of a database
On 7 Oct 2009, at 7:20pm, Adam DeVita wrote: > regarding this > " The fault is that > almost nobody does it right: they neglect to keep an 'unaltered > central copy' and think they can cross-apply journals each time two > databases talk to one-another. That does not work for various > reasons." > > Would a central repository of journals that can be applied to local > repositories be sufficient? I suppose I assume that running the same > program on N workstations with the same set of journals should > produce N > identical results. You need a copy of the database which is not changed by any site. All the sites send in their journals. The journals are merged into a superjournal in time order. The superjournal is then applied to the central copy of the database. Then the updated database is sent back out to all sites. The problem comes when you apply multiple journals in a different order. Start with each site with identical copies of a TABLE with three clients: one managed by Mr. Green, one by Mr. Red, and one by Mr. Black. 'G R B'. Then, in this order ... Mr. Green goes on holiday ... Site A says that all Mr. Green's customers will be handled by Mr. Red. UPDATE clients SET contact = 'red' WHERE contact = 'green' Mr. Red goes on holiday ... Site B says that all Mr. Red's customers will be handled by Mr. Black. Then Mr. Green comes back from holiday, and Mr. Black goes on holiday so ... Site C says that all Mr. Black's customers will be handled by Mr. Green. Then they all synchronise databases. See if you can make them all end up with the same data if they synch against each-other rather than a central unaltered copy of the databases. Doesn't work: one site might have 'B B B', another 'R B R'. You can do it only by luck ... by happening to know in which order people went on holiday. However, if you always synch against a central unaltered copy of the database you can synch in any order. 
Once everyone has synchronised you distribute a copy of the central database to everyone and they all have identical data once more. That's the simplest setup. You can get more complicated by having each site remember which journals they've played back. The problem does not occur if any record can only ever be modified by one site. But if you have the normal 'anyone can do anything' setup, you have to be really really careful. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
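Simon's holiday example can be played out concretely: collect every site's journal, merge them into one superjournal in time order, and replay that against the pristine central copy. A toy sketch (Python; the journal representation as (timestamp, sql) pairs is invented for illustration, and conflict handling is ignored):

```python
import sqlite3

# Pristine central copy: three clients handled by Green, Red, Black.
central = sqlite3.connect(":memory:")
central.execute("CREATE TABLE clients (name TEXT, contact TEXT)")
central.executemany("INSERT INTO clients VALUES (?, ?)",
                    [("c1", "green"), ("c2", "red"), ("c3", "black")])

# Journals from three sites, each entry tagged with when it happened.
site_a = [(1, "UPDATE clients SET contact='red'   WHERE contact='green'")]
site_b = [(2, "UPDATE clients SET contact='black' WHERE contact='red'")]
site_c = [(3, "UPDATE clients SET contact='green' WHERE contact='black'")]

# Merge into one superjournal in time order, then replay it once
# against the unaltered central copy.
superjournal = sorted(site_a + site_b + site_c)
for _, sql in superjournal:
    central.execute(sql)
central.commit()

contacts = [r[0] for r in central.execute(
    "SELECT contact FROM clients ORDER BY name")]
print(contacts)
```

Replayed in time order against the one central copy, every client deterministically ends up with Mr. Green; replaying the same three journals pairwise between sites in different orders is what produces the inconsistent 'B B B' / 'R B R' outcomes Simon describes.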
Re: [sqlite] Insert speed greatly decreasing over time
Hello!

Because an update == delete + insert, you are bound to get fragmentation. The
degree of fragmentation is proportional to the number of replaces, so you may
see problems after inserting a lot of duplicates. Do you really need to insert
or update?

1. I'm using few-hundred-MB databases like

CREATE TABLE telephony_log (
  ...
  unique (nas_name,port,duration,origin,date_start) on conflict ignore
);

Performance of the "ignore" conflict resolution is much better.

2. Maybe a unique index can be better than a text PK.

3. Maybe "insert into t select * from temp_t" can be helpful.

4. You can mark records as deleted without deleting them immediately, and
delete old records periodically, with a vacuum after that operation:

==
PRAGMA auto_vacuum=0;
pragma default_cache_size=20;

CREATE TABLE t (k varchar(50) not null, d text not null, e datetime, is_new int default 1);
create index t_k_idx on t(k);

begin;
CREATE TEMP TABLE temp_t (k varchar(50) primary key not null, d text not null, e datetime, is_new int default 1);
insert into temp_t ...
update t set is_new=0 where k in (select k from temp_t);
insert into t select * from temp_t;
commit;

By cron:
delete from t where is_new=0;
vacuum;
==

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Synchronising multiple copies of a database
Simon Slavin wrote: > On 7 Oct 2009, at 7:20pm, Adam DeVita wrote: > > >> regarding this >> " The fault is that >> almost nobody does it right: they neglect to keep an 'unaltered >> central copy' and think they can cross-apply journals each time two >> databases talk to one-another. That does not work for various >> reasons." >> >> Would a central repository of journals that can be applied to local >> repositories be sufficient? I suppose I assume that running the same >> program on N workstations with the same set of journals should >> produce N >> identical results. >> > > SNIP > The problem comes when you apply multiple journals in a different > order. Start with each site with identical copies of a TABLE with > three clients: one managed by Mr. Green, one by Mr. Red, and one by > Mr. Black. 'G R B'. Then, in this order ... > > Mr. Green goes on holiday ... > Site A says that all Mr. Green's customers will be handled by Mr. Red. > UPDATE clients SET contact = 'red' WHERE contact = 'green' > > Mr. Red goes on holiday ... > Site B says that all Mr. Red's customers will be handled by Mr. Black. > > Then Mr. Green comes back from holiday, and Mr. Black goes on holiday > so ... > Site C says that all Mr. Black's customers will be handled by Mr. Green. > > Then they all synchronise databases. See if you can make them all end > up with the same data if they synch against each-other rather than a > central unaltered copy of the databases. Doesn't work: one site might > have 'B B B', another 'R B R'. You can do it only by luck ... by > happening to know in which order people went on holiday. However, if > you always synch against a central unaltered copy of the database you > can synch in any order. Once everyone has synchronised you distribute > a copy of the central database to everyone and they all have identical > data once more. That's the simplest setup. You can get more > complicated by having each site remember which journals they've played > back. 
>

Simon,

Isn't this a variation of the DVCS problem? In other words, would it be correct to say that the underlying issue is treating this as a database problem, rather than as a versioning problem which happens to involve a database? I ask because there are two separate projects which involve this sort of issue, and which I have simply deferred for the time being.

John
Re: [sqlite] Synchronising multiple copies of a database
"You have to be really careful"

Absolutely, even if you know the order of updates (which I do). If site A updates an offline record in a cached copy after site B deletes it, other sites can receive the change records in order and still have the record re-appear (via insert or replace).

One can also get a mess if Mr. Red and Mr. Black both get new customers and enter them, and both customers get the same ID because the auto-generated int happens to be the same. Both copies get updated with the other guy's data; the users then get annoyed and enter the stuff again, and it happens all over again, but now there are N entries of the other guy's customer in the database, depending on how many times they do it.

On Wed, Oct 7, 2009 at 3:18 PM, Simon Slavin wrote:
>
> On 7 Oct 2009, at 7:20pm, Adam DeVita wrote:
>
> > regarding this
> > " The fault is that
> > almost nobody does it right: they neglect to keep an 'unaltered
> > central copy' and think they can cross-apply journals each time two
> > databases talk to one-another. That does not work for various
> > reasons."
> >
> > Would a central repository of journals that can be applied to local
> > repositories be sufficient? I suppose I assume that running the same
> > program on N workstations with the same set of journals should
> > produce N
> > identical results.
>
> You need a copy of the database which is not changed by any site. All
> the sites send in their journals. The journals are merged into a
> superjournal in time order. The superjournal is then applied to the
> central copy of the database. Then the updated database is sent back
> out to all sites.
>
> The problem comes when you apply multiple journals in a different
> order. Start with each site with identical copies of a TABLE with
> three clients: one managed by Mr. Green, one by Mr. Red, and one by
> Mr. Black. 'G R B'. Then, in this order ...
>
> Mr. Green goes on holiday ...
> Site A says that all Mr. Green's customers will be handled by Mr. Red.
> UPDATE clients SET contact = 'red' WHERE contact = 'green'
>
> Mr. Red goes on holiday ...
> Site B says that all Mr. Red's customers will be handled by Mr. Black.
>
> Then Mr. Green comes back from holiday, and Mr. Black goes on holiday
> so ...
> Site C says that all Mr. Black's customers will be handled by Mr. Green.
>
> Then they all synchronise databases. See if you can make them all end
> up with the same data if they synch against each-other rather than a
> central unaltered copy of the databases. Doesn't work: one site might
> have 'B B B', another 'R B R'. You can do it only by luck ... by
> happening to know in which order people went on holiday. However, if
> you always synch against a central unaltered copy of the database you
> can synch in any order. Once everyone has synchronised you distribute
> a copy of the central database to everyone and they all have identical
> data once more. That's the simplest setup. You can get more
> complicated by having each site remember which journals they've played
> back.
>
> The problem does not occur if any record can only ever be modified by
> one site. But if you have the normal 'anyone can do anything' setup,
> you have to be really really careful.
>
> Simon.

--
VerifEye Technologies Inc.
905-948-0015x245
7100 Warden Ave, Unit 3
Markham ON, L3R 8B5
Canada
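[Editor's note] The duplicate-ID problem Adam describes comes from each site handing out its own auto-generated integers. A minimal sketch of two sites colliding, using Python's built-in sqlite3 module with a hypothetical `customers` table:

```python
import sqlite3

def new_site():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    return db

site_red, site_black = new_site(), new_site()

# Each site inserts its own new customer; both get the next rowid, which is 1.
site_red.execute("INSERT INTO customers (name) VALUES ('red customer')")
site_black.execute("INSERT INTO customers (name) VALUES ('black customer')")

red_id   = site_red.execute("SELECT id FROM customers").fetchone()[0]
black_id = site_black.execute("SELECT id FROM customers").fetchone()[0]
print(red_id, black_id)  # 1 1 -- the same ID for two different customers

# Replaying Black's journal at Red's site silently clobbers Red's row:
site_red.execute("INSERT OR REPLACE INTO customers VALUES (?, 'black customer')",
                 (black_id,))
names = [r[0] for r in site_red.execute("SELECT name FROM customers")]
print(names)  # ['black customer'] -- Red's customer is gone
```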
Re: [sqlite] Synchronising multiple copies of a database
On 7 Oct 2009, at 8:33pm, John Elrick wrote:

> Isn't this a variation of the DVCS problem? In other words, would it
> be correct in saying that the underlying issue is treating this as a
> database problem, rather than it being a versioning problem which
> happens to involve a database?

Yes yes yes. And look how complicated the programming is to do DVCS correctly.

And now the problem is ... what is the resolution of your versioning? Do you put a version number on each row, or do you consider each column in each row to need its own version number? What if one site changes one column and another site changes another column of the same row? How many version numbers do you want to store for just one table?

Simon.
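[Editor's note] One common answer to the per-row question is optimistic concurrency: keep a version number on each row and make every update conditional on the version the writer last saw. This is a generic sketch, not anything SQLite-specific, using Python's built-in sqlite3 module and a hypothetical `clients` table:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY, contact TEXT, version INTEGER)")
db.execute("INSERT INTO clients VALUES (1, 'green', 1)")

def try_update(db, row_id, seen_version, new_contact):
    """Succeeds only if the row still carries the version this writer read."""
    cur = db.execute(
        "UPDATE clients SET contact = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_contact, row_id, seen_version))
    return cur.rowcount == 1  # 0 rows touched means someone else got there first

# Both sites read the row at version 1, then try to write their change:
ok_a = try_update(db, 1, 1, 'red')    # first writer wins
ok_b = try_update(db, 1, 1, 'black')  # stale version, conflict detected
print(ok_a, ok_b)  # True False
```

The losing site then has to fetch the current row and decide how to merge, which is exactly the hard policy question raised above.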
Re: [sqlite] Synchronising multiple copies of a database
Simon Slavin wrote:
> On 7 Oct 2009, at 8:33pm, John Elrick wrote:
>
>> Isn't this a variation of the DVCS problem? In other words, would it
>> be correct in saying that the underlying issue is treating this as a
>> database problem, rather than it being a versioning problem which
>> happens to involve a database?
>
> Yes yes yes. And look how complicated the programming is to do DVCS
> correctly.

I would think the programming is relatively easy... the hard part is getting the rules right and being confident you've covered all the edge cases.

> And now the problem is ... What is the resolution of your
> versioning ? Do put a version number on each row, or do you consider
> each column in each row to need its own version number ? What if one
> site changes one column and another changes another column of the same
> row ? How many version numbers do you want to store for just one
> table ?

As many as necessary, but no more.

John
[sqlite] Datetime mystery
Dear list,

I am sorry if I am asking a FAQ, but what is different between datetime() and time()?

> date # This is the correct time on the system
Ons 7 Okt 2009 23:56:36 CEST
> sqlite3 temp.sqlite "SELECT datetime();"
2009-10-07 21:56:58
> sqlite3 temp.sqlite "SELECT datetime('now);"
SQL error: unrecognized token: "'now);"
> sqlite3 temp.sqlite "SELECT datetime('now');"
2009-10-07 21:57:13
> sqlite3 temp.sqlite "SELECT time('now');"
21:59:05

What happened here? How come the time functions are off by 2 hours? (I am using sqlite version 3.5.9 on a Mac OS Leopard machine)

/Fredrik

--
"Life is like a trumpet - if you don't put anything into it, you don't get anything out of it."
Re: [sqlite] Datetime mystery
On Wed, Oct 7, 2009 at 5:02 PM, Fredrik Karlsson wrote:
> Dear list,
>
> I am sorry if I am asking a FAQ, but what is differnent with
> datetime() and time()?
>
>> date # This is the correct time on the system
> Ons 7 Okt 2009 23:56:36 CEST
>> sqlite3 temp.sqlite "SELECT datetime();"
> 2009-10-07 21:56:58
>> sqlite3 temp.sqlite "SELECT datetime('now);"
> SQL error: unrecognized token: "'now);"
>> sqlite3 temp.sqlite "SELECT datetime('now');"
> 2009-10-07 21:57:13
>> sqlite3 temp.sqlite "SELECT time('now');"
> 21:59:05
>
> What happened here? How come the time functions are off 2 hours?
> (I am using sqlite version 3.5.9 on a Mac OS Leopard machine)

Time zones. The times SQLite returns are, by default, UTC.

--
Puneet Kishor http://www.punkish.org
Carbon Model http://carbonmodel.org
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
Nelson Institute, UW-Madison http://www.nelson.wisc.edu
---
Assertions are politics; backing up assertions with evidence is science
===
Sent from Madison, WI, United States
Re: [sqlite] Datetime mystery
Hi,

On Thu, Oct 8, 2009 at 12:04 AM, P Kishor wrote:
> On Wed, Oct 7, 2009 at 5:02 PM, Fredrik Karlsson wrote:
>> Dear list,
>>
>> I am sorry if I am asking a FAQ, but what is differnent with
>> datetime() and time()?
>>
>>> date # This is the correct time on the system
>> Ons 7 Okt 2009 23:56:36 CEST
>>> sqlite3 temp.sqlite "SELECT datetime();"
>> 2009-10-07 21:56:58
>>> sqlite3 temp.sqlite "SELECT datetime('now);"
>> SQL error: unrecognized token: "'now);"
>>> sqlite3 temp.sqlite "SELECT datetime('now');"
>> 2009-10-07 21:57:13
>>> sqlite3 temp.sqlite "SELECT time('now');"
>> 21:59:05
>>
>> What happened here? How come the time functions are off 2 hours?
>> (I am using sqlite version 3.5.9 on a Mac OS Leopard machine)
>
> time zones. The sqlite returned times, by default, are UTC.

Yes, that would have been my guess too, but I am on CET, which I understand is UTC+1. So I am still getting one hour less than I should from SQLite. Or am I doing something stupid?

/Fredrik

"Life is like a trumpet - if you don't put anything into it, you don't get anything out of it."
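[Editor's note] The remaining hour is daylight saving: the `date` output above says CEST, Central European *Summer* Time, which is UTC+2 in early October, so the two-hour gap is exactly the UTC offset. SQLite's date functions return UTC unless given the 'localtime' modifier, which converts using the operating system's time zone rules. A quick check using Python's built-in sqlite3 module:

```python
import sqlite3

db = sqlite3.connect(":memory:")
utc_now   = db.execute("SELECT datetime('now')").fetchone()[0]
local_now = db.execute("SELECT datetime('now', 'localtime')").fetchone()[0]
# datetime('now') is always UTC; the 'localtime' modifier applies the OS
# time zone, including any daylight-saving offset currently in effect.
print(utc_now, "UTC")
print(local_now, "local")
```

On the machine above, the second query would have printed 23:56, matching `date`.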