Re: [sqlite] Suppressing column (field) headers?
Hello,

[EMAIL PROTECTED] 21/11/2004 04:56 PM
Please respond to sqlite-users
To: [EMAIL PROTECTED]
Subject: [sqlite] Suppressing column (field) headers?

> 1. How can I get the dll to return JUST the desired data?
> When I send the following SQL, for example,
>   select colEntry from tblEntry where colNum = 3
> the textbox shows
>   colEntry
>   "The text in the column."

This is being done by your Delphi wrapper. Please consult its documentation. The C interface of sqlite (for both the 2.8 series and the 3.0 series) returns the column names and column values separately, and doesn't return any extra "quote" characters.

Clay Dowling wrote:
> You can get just the data that you want by using the prepared statements
> feature of the 3.0.x dll. It sounds like you're using the table output
> method for getting your results. This is fine, but it means that you need
> to ignore the first row of data, which in the table interface is the
> column labels.

This statement makes you sound like you're taking crazy pills, Clay (prepared statements and non-prepared statements aren't functionally different in sqlite)... but I'm interested to know if you have a grain of truth hidden in there. Are you familiar with the Delphi wrapper Mswarm is using?

Which Delphi wrapper are you using, Mswarm? What does the code that issues your query look like?

Benjamin.
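For what it's worth, the separation the engine provides is visible from any wrapper that exposes it directly. A minimal sketch using Python's sqlite3 module (my choice of illustration, not the Delphi wrapper in question):

```python
import sqlite3

# The engine returns column names and row values separately,
# and adds no quote characters of its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEntry (colNum INTEGER, colEntry TEXT)")
conn.execute("INSERT INTO tblEntry VALUES (3, 'The text in the column.')")

cur = conn.execute("SELECT colEntry FROM tblEntry WHERE colNum = 3")
names = [d[0] for d in cur.description]  # the headers, kept separate
rows = cur.fetchall()                    # just the data

print(names)  # ['colEntry']
print(rows)   # [('The text in the column.',)]
```

A wrapper that mixes the two (headers as the first "row", values wrapped in quotes) is doing that itself, on top of what the engine hands back.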
Re: [sqlite] Client/Server Environment
Hello,

"Clay Dowling" <[EMAIL PROTECTED]> 09/11/2004 02:39 AM
Please respond to sqlite-users
To: [EMAIL PROTECTED]
Subject: Re: [sqlite] Client/Server Environment

> Richard Boehme said:
> > Hi there. How well does SQLite do in a client/server environment where
> > the SQLite database is handling a batch of 100 inserts and the same
> > number of queries every 0.5 seconds? The inserts would be batched into
> > a transaction. The queries would most likely not.
>
> I'm not trying to run down SQLite here, but don't you think that this kind
> of load justifies a pretty heavy duty database? SQLite is a disk based
> database, not a client server database. For that kind of throughput, the
> traditional choice has been a large scale client server database, and
> there's been a pretty good reason why that's been so.

If you are talking about scaling the application to a large number of machines to try to balance load, the big end databases may suit you. If you are talking about putting a big iron database to work on a single server and single disk array to get higher performance than sqlite... forget it.

My figures are a little out of date now, but I looked for a high performance database for my own application a couple of years back and found that sqlite blew the popular alternatives[1] out of the water in data throughput. Since that time SQLite has become faster for many functions while other databases continue to get fatter.

As with any performance-sensitive application, you must profile the alternatives you are considering on your own target hardware with your own projected target load to get a reliable indication of which will perform. I would certainly not discount sqlite based on a mailing list post. Try it out and see. I was pleasantly surprised, and suspect you will be also. Please report back, too. Your experience may be a useful data point for someone else in your position in future.
> 100 separate
> clients trying to access the same file on disk is bound to cause a few
> problems. The capacity of the database driver is probably the least of
> your worries.

100 readers does not equal 100 separate clients trying to access the same file on disk. It may be equal to 100 clients trying to access the same piece of cache memory. That's certainly something you can do more than 100 times in half a second. The quality of your operating system will have some impact here, but I believe that even Microsoft can manage to do this properly these days.

The main difference between the way sqlite handles performance and the way other databases do is that SQLite relies on the well-worn file operations of your operating system more than other databases, which try to implement these operations themselves. In bygone days doing it yourself was a good thing, because OS file handling was often poor. These days, relying on the OS is the good thing: it keeps sqlite lean and allows the operating system to do the heavy lifting.

If sqlite is going to be a problem under the proposed loading, it will most likely be related to locking and possible delays introduced by clients having to sleep to get their locks in. As presented, though, I suspect that won't be a huge problem. It would largely depend on the size and time-cost of the queries being performed.

Benjamin

[1] Mysql, postgres, sapdb, oracle.
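To make the batching point concrete, here is a rough sketch of the proposed workload using Python's sqlite3 module (the table and column names are invented for illustration). Wrapping the 100 inserts in one transaction means one fsync per batch instead of one per row:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # a real file path in the actual scenario
conn.execute("CREATE TABLE samples (t REAL, v INTEGER)")

# The batch of 100 inserts, committed as a single transaction.
rows = [(time.time(), i) for i in range(100)]
with conn:  # opens a transaction; commits on clean exit
    conn.executemany("INSERT INTO samples VALUES (?, ?)", rows)

count = conn.execute("SELECT count(*) FROM samples").fetchone()[0]
print(count)  # 100
```

The queries running alongside would each see either all 100 rows of a batch or none of them, which is usually what you want anyway.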
Re: [sqlite] strange rounding problem
Donald,

Will Leshner <[EMAIL PROTECTED]> 14/10/2004 04:20 AM
Please respond to sqlite-users
To: [EMAIL PROTECTED]
Subject: Re: [sqlite] strange rounding problem

> The point is that the number is in the database as '358', but by the time
> we get it back from a query it has become '357.999'. We aren't doing any
> floating point with the number. It is (apparently) happening for us in the
> SQLite engine itself. But we depend on the number being '358' when we
> query for it and because it isn't, on some machines, our app fails.

Can you come up with a minimal test case that exhibits this behaviour? Ideally, it would be a series of SQL statements written into the sqlite shell program that results in something visibly wrong. The second best case would be a short (10-20 line) C program that uses the sqlite API only to demonstrate the problem. Perhaps other users of this list could then test on their own machines and come up with more information.

Benjamin.
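As a starting point, a round-trip check along these lines is the kind of minimal test case I mean (sketched here with Python's sqlite3 module rather than the shell or C API; on the affected machines the retrieved value would presumably come back wrong):

```python
import sqlite3

# Store the integer 358 and read it straight back, no arithmetic involved.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n)")
conn.execute("INSERT INTO t VALUES (358)")
(value,) = conn.execute("SELECT n FROM t").fetchone()
print(value)  # expected 358; reportedly 357.999 on some machines
```

If something this small reproduces the problem, others on the list can run it and report what they see.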
[sqlite] A quick code review, analysis of locking model, "fine-tuning" suggestions
G'day,

I thought I'd do somewhat of a code review on the lower-level pieces of sqlite 3.0.1, so here goes:

v- code review -v

You use a loop here to try to write all data synchronously to disk. I had to read it a couple of times before I realised it was actually performing the check on write's return correctly, but I am happy with it. Is there a reason why similar logic doesn't appear in sqlite3OsRead? Background: under unix, multiple reads/writes may be required for "slow" devices. One read/write should always suffice for a "fast" device. A disk is considered a fast device for this purpose. An nfs mount would be considered a slow device.

I've always been uneasy about the interaction in sqlite between the pager and os layer with regards to locking. It seems like excessive coupling to me. The os layer is essentially written to know the lock state of the pager, and then participate in a dance whereby the pager and os always approximately agree on the lock status. The sqlite3OsLock function is an example of this, where an attempt to grab a lock of a less than or equal status to the current lock is a no-op. To me this means that the pager is making unnecessary calls and doesn't seem particularly aware of its own state. The os layer knows too much about what's happening ("I know that you didn't really mean to call this function"), and the pager doesn't know enough ("I'll make this call, just in case"). I think that all such no-op forms of os layer functions should be replaced with assertions and the code above made self-contained.

The concept of upgrading and downgrading locks transparently has also consistently bugged me, especially when I've wanted to use blocking locks. It makes things much harder when there aren't clearly-defined individual lifecycles for reader and writer locks, and when writers are forced to share the early lifecycle of readers.

Extraneous assert. The if condition tests this assert already. Use the #define value rather than the numeric value of NO_LOCK.
F_RDLCK should be replaced with F_WRLCK. Locking only with F_RDLCK has no effect. Use the #define value rather than the numeric value of NO_LOCK. lock.l_len should be set to 1 when it needs to be that value. Doing it at the top of the function reduces code clarity and introduces unnecessary asymmetry in conditional branches.

I'm nervy about the different locks that might be held by a process in EXCLUSIVE_LOCK state depending on how it reached that state. If it went through SHARED_LOCK it has SHARED_FIRST through SHARED_FIRST + SHARED_SIZE write-locked, plus PENDING_BYTE through PENDING_BYTE + 1 read-locked. If it came through the reserved state it also has a write lock on RESERVED_BYTE through RESERVED_BYTE + 1. This appears to be dealt with in the unlock code, but it grates a little. I actually don't like the SHARED -> PENDING path at all, and think it should be removed for simplicity. It effectively creates two versions each of the pending and exclusive states.

I don't like the setting of PENDING_LOCK here. Surely the code would be clearer if it were set back when the pending lock was obtained. I understand that you still want access to the "old value" during the rest of the function, but couldn't you copy it?

Ok, I haven't reviewed much past this point. I was hoping to get in some comments on the pager itself, which I haven't read yet... but I've been at this for a little too long now.
v- analysis of locking model -v

Just to get this straight in my head, this is the current unix locking model (does this appear somewhere in comments?):

States:
  NO_LOCK = nothing locked
  SHARED_LOCK = SHARED_FIRST through SHARED_FIRST + SHARED_SIZE read-locked
  RESERVED_LOCK = RESERVED_BYTE through RESERVED_BYTE + 1 write-locked, plus the locks of SHARED_LOCK state
  PENDING_LOCK = PENDING_BYTE through PENDING_BYTE + 1 read-locked, plus either the locks of SHARED_LOCK state or the locks of RESERVED_LOCK state
  EXCLUSIVE_LOCK = SHARED_FIRST through SHARED_FIRST + SHARED_SIZE write-locked, plus the locks of PENDING_LOCK state

Transitions:
  NO_LOCK -> SHARED_LOCK:
    1. Pick up locks for pending state
    2. Pick up locks for shared state
    3. Drop locks for pending state
  SHARED_LOCK -> RESERVED_LOCK:
    1. Pick up locks for reserved state. The reserved lock is exclusive, so only one process can be in reserved state at any one time, but concurrency with readers is ok.
  SHARED_LOCK -> EXCLUSIVE_LOCK:
    1. Pick up locks for pending state
    2. Pick up locks for exclusive state
  RESERVED_LOCK -> EXCLUSIVE_LOCK:
    1. Pick up locks for pending state
    2. Pick up locks for exclusive state
  PENDING_LOCK ->
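To check my reading of that transition table, here is a toy model of it in Python (my own sketch of the rules as I've written them out, not sqlite code); illegal upgrades are rejected outright:

```python
# Lock states, ordered by strength, as in the model above.
NO_LOCK, SHARED_LOCK, RESERVED_LOCK, PENDING_LOCK, EXCLUSIVE_LOCK = range(5)

# The upgrades the model allows a process to request. PENDING is also
# passed through internally on the way to EXCLUSIVE.
ALLOWED = {
    (NO_LOCK, SHARED_LOCK),
    (SHARED_LOCK, RESERVED_LOCK),
    (SHARED_LOCK, EXCLUSIVE_LOCK),
    (RESERVED_LOCK, EXCLUSIVE_LOCK),
}

def upgrade(current, wanted):
    """Return the new state, or raise if the model forbids the jump."""
    if (current, wanted) not in ALLOWED:
        raise ValueError("illegal transition %d -> %d" % (current, wanted))
    return wanted

# The writer's path through the states.
state = upgrade(NO_LOCK, SHARED_LOCK)
state = upgrade(state, RESERVED_LOCK)
state = upgrade(state, EXCLUSIVE_LOCK)
print(state)  # 4
```

A model like this is also where I'd hang the assertions I argued for above: any call that would be a no-op in the real code becomes an error here.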
Re: [sqlite] database corruption
Hello,

"D. Richard Hipp" <[EMAIL PROTECTED]> 24/06/2004 06:04 AM
To: [EMAIL PROTECTED]
Subject: Re: [sqlite] database corruption

> Michael Robinette wrote:
> > ...
> You present a new and novel approach to corrupting the database, which
> is to combine a database file with a journal from a different database
> into the same directory. We'll be thinking about how to prevent this
> attack in the 6 days that remain before we freeze the 3.0.0 database
> format.

This is actually a variant of the method of corrupting the database that fsync()ing the directory containing your journal each commit is designed to solve. An unsynched directory entry may lead, after a power failure, to the existence of an old journal file instead of the one that relates to the current database state. Obviously, this variant is a solved problem while others are not.

The variant I'm most concerned about is actually a copy operation. User A says to himself "they're just files, I'll copy them onto my backup media". This will often appear to work, so he won't be concerned. One day he restores the files and "weird things" start happening. I'm not sure there's a solution to that, other than user education or an operating-system-level implementation of the journalling itself that treats a copy operation the same as other kinds of database reads.

Ultimately the ideal world would have sqlite journalling built into the kernel vfs layer. Hrrmm... I've heard that windows longhorn might incorporate this kind of function. Perhaps we should be pushing for its introduction into other operating systems. It's really very compatible with other file operations where you might want to ensure that readers always see a consistent state of the data. It might also make sqlite just a little touch lighter and more focused. Hrrm.
From http://www.namesys.com/faq.html:

> However, although file data may appear to be consistent from the kernel point of view, since there is no API exported to the userspace to control transactions, we may end up in a situation where the application makes 2 write requests (as part of one logical transaction) but only one of these gets journaled before the system crashes. From the application point of view, we may then end up with inconsistent data in the file. Such issues should be addressed with the upcoming ReiserFS v.4 release. Such an API will be exported to userspace and all programs that need transactions will be able to use it.

From http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html:

> [...] one other thing that I want to do is to actually export the nested transaction API into userspace. You have to be very, very careful about that because it's not possible to guarantee proper database semantics. You can't have unbounded, large transactions. You have to have some way in which the user application can get in advance some idea of how many disk blocks it's going to need to modify for the operation, because it's going to call various things like that which are not entirely straightforward; it's not quite as simple as people would hope. But it's sufficiently useful that that will be exported to userspace at some point.

From http://lists.linux-ha.org/pipermail/linux-ha/1999-May/007901.html:

> A user-visible transaction API is something entirely different. No way does it belong in the kernel.

From http://seclists.org/lists/linux-kernel/2003/Sep/1364.html:

> There will be a new API to support userspace-controlled multifile transactions. At first stab, multifile transactions will be used internally to implement extended attributes. Now, another question is.. will the transaction API support commit() and rollback()? *grin*

From http://www.linuxjournal.com/article.php?sid=4466:

> From time to time, people ask for a version of the transaction API exported to user space.
> The ReiserFS journal layer was designed to support finite operations that usually complete very quickly, and it would not be a good fit for a general transaction subsystem. It might be a good idea to provide atomic writes to user space, however, and give them more control over grouping operations together. That way an application could request for a 64K file to be created in a certain directory and treat it like an atomic operation. Very little planning has happened in this area thus far. A full transaction API will probably never be exported by the kernel itself, however some basic hooks may eventually be provided if enough people can agree on what those hooks should be. Most of the work would be performed in user-space.

Thoughts:

* The brief period for which sqlite now has an inconsistent state in the main database makes the copy scenario less likely to be a problem, but the problem may still occur occasionally.
* In an embedded scenario you
Re: [sqlite] RE: SQLite performance with mid-size databases
"Ismail Kizir" <[EMAIL PROTECTED]> 17/06/2004 06:48 PM
To: <[EMAIL PROTECTED]>
Subject: Re: [sqlite] RE: SQLite performance with mid-size databases

> I had the same problem as Richard's.
> Unfortunately, i have decided to use postgres and finally back to mysql.

If you're happy with your current solution, feel free to ignore the rest of this email. All one can hope for is to be happy with one's present solution :)

> I have a table of ~300,000 rows of newspaper articles, and its fulltext
> index tables.

Sqlite doesn't support fulltext indexes. I suggest explaining what you were trying to achieve, and how.

> Table opening time was very high.

There's no such thing as table opening time in sqlite :) What was being done to sqlite during "table open time"?

> Concurrency -even only for reading- was very very poor.

What was the nature of your concurrent access? Were any writers active? Did readers use transactions?

> Especially, i observed the linear write time just as Richard did.

I'm not sure what you're saying here. Did a single insert take O(n) time where n was the size of the database, or O(n) time where n is the size of the row data? Was it really O(n), or was it O(log(n)) as modern databases such as sqlite will normally guarantee? How much data was in the row you were inserting? How did you measure the linear growth (is this in the archives somewhere?)?

> I think SQLite is still a good solution for embedded system, but not even
> for mid-size databases.

I'm using sqlite in a heavily-loaded system consisting of database files that are created over a two-day period. The average database file is about 800 meg. After extensive testing early in the piece I concluded that only sqlite was suitable. All alternative technologies I tested were far too slow for my (now aging) sun hardware. I tested sqlite, postgres, mysql, and sapdb. Anecdotally, I'd have to say that SQLite is ideal for targeted mid-size databases.

Benjamin.
- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] SQLite performance with mid-size databases
Raymond,

So far on this list I have only seen reports of scalability problems with sqlite where the real problem was specific queries being given to sqlite that were constructed in a way that sqlite does not process well. In every such case I can recall, an alternative form of the query was produced (usually by DRH personally) that did not exhibit the problem. Knowing the code, there is no reason for sqlite to scale linearly in queries unless it is being forced to do table scans.

It is true that sqlite doesn't optimise queries as well as major databases. It's not designed to. Users must take some care in constructing their queries and ensuring the queries are suited to sqlite's design if performance might be a problem. At the same time, such queries usually perform much better than those of major databases because of the vastly reduced optimisation and connection overhead that sqlite affords.

As with every database technology, if you care about performance you have to understand some things about the design of your underlying technology. For sqlite the design is simple and the experts are extremely responsive. If you're having problems and can provide a clear, specific description of your problem you will get help. While queries such as "I think sqlite scales linearly, but I can't tell you what queries I'm issuing to make that happen" and "My scroll list seems slow, but I don't know what is happening between the GUI and the database or what queries are going on" are unlikely to solicit a helpful response, "I am issuing this query on that database schema with about 100 thousand rows each carrying 2k of data" is likely to be something members of this list can help you solve.

I'd like to take this soapbox opportunity to again thank drh and his associates for the wonderful work they put into sqlite and their genuine personal commitment to this project. Sqlite is a great product, and a well-targeted one.
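To illustrate the table-scan point: EXPLAIN QUERY PLAN shows whether sqlite will walk the whole table or use an index for a given query. A small sketch using Python's sqlite3 module (the schema is invented for illustration, and the exact plan text varies between sqlite versions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO articles(body) VALUES (?)",
                 [("row %d" % i,) for i in range(1000)])

# Without an index, this WHERE clause forces a full table scan...
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM articles WHERE body = 'row 500'"
).fetchall()
print(plan)  # plan describes a scan of articles

# ...while after adding one, the same query becomes an index lookup.
conn.execute("CREATE INDEX idx_body ON articles(body)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM articles WHERE body = 'row 500'"
).fetchall()
print(plan)  # plan now mentions idx_body
```

When a query that should be O(log(n)) is scaling linearly, this is the first thing worth checking before blaming the engine.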
Benjamin

Raymond Irving <[EMAIL PROTECTED]> 18/06/2004 12:09 AM
To: [EMAIL PROTECTED]
Subject: Re: [sqlite] SQLite performance with mid-size databases

Well this does not sound good at all. I would think that SQLite would at least do a better job at queries. The issues with performance and scalability are of great concern. I was planning on creating some new apps that use SQLite, but now I'm wondering if this little database is capable of handling over 2 Gigabytes (GB) of data even though the docs say it can handle up to 2 Terabytes (TB). Does it really make sense to cut back on performance in order to keep the library size small? Will SQLite 3.0 fix these problems?

__
Raymond Irving

--- "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:
> Richard Kuo wrote:
> >
> > I suspect some unnecessary disk access has to be the problem... despite
> > the small amount of new guide information being queried out, disk bytes
> > read is several times higher than with MS access, and scrolling back over
> > previously accessed areas of data is visibly faster... indicating that
> > the disk cache is very favorably impacting the speed of the queries.
>
> If each of your rows contains 2K of data, that means each database entry
> is using about 2 overflow pages. You can change this by increasing the
> page size. Try recompiling SQLite after changing the SQLITE_PAGE_SIZE
> macro to 16384.
>
> You might also try switching to SQLite version 3.0.0, which will be
> released tomorrow.
>
> Also tomorrow, I will be making available a database analysis tool
> for version 2.8 databases that will help us to better understand
> how information is stored on disk for your database, and possibly
> provide some clues about why you are having problems.
Re: [sqlite] Locking and concurrency in SQLite version 3.0
G'day,

"D. Richard Hipp" <[EMAIL PROTECTED]> 12/06/2004 08:16 AM
To: [EMAIL PROTECTED]
Subject: [sqlite] Locking and concurrency in SQLite version 3.0

> http://www.sqlite.org/lockingv3.html

My thoughts are listed as they come to me.

Thought 1: Section 5.0, entitled "Writing to a database file": after the in-memory cache initially spills to disk, the exclusive lock must be maintained because the database file has changed. One way to avoid this might be to change the database file and log structure as follows:

1) Add a numeric entry to each page in the database file that refers to a specific page (or file offset) in the log file.
2) Add an entry to each log file entry indicating a 0 or 1. If a page read from the database has a non-zero file offset, that page refers to the roll-forward log entry that supersedes it. A zero in the log file entry indicates it is a rollback entry, while a 1 indicates it is a roll-forward entry.

The algorithms described would change in the following ways:

1) Instead of writing the dirty page to the main file when memory spillage occurs, write it to the journal. If the main file entry already has a file offset encoded into it, write the page to that offset. If the main file entry has no offset, write the page at the end of the journal file and overwrite only the offset of the main page.
2) Readers with shared locks should always overlook any such offsets they find in main files. Readers with any of the writer locks should refer to the journal for the updated version of such pages.
3) When rolling back a journal file, only roll back pages with a 0 entry in the rollback/roll-forward field.
4) When committing a transaction, write all pages from memory, but also commit any pages in the journal with a 1 in the rollback/roll-forward field.
5) You might have to rethink any vacuum operation and some other small aspects of life. By using the main file as an index into the roll-forward log you make truncating the database file more difficult.
One extra alternative to throw in is to keep the roll-back and roll-forward journals in separate files. That would avoid the need to identify the individual log entries as roll-back or roll-forward, and may improve performance of large changes. The roll-forward file would never have to be committed.

This approach differs slightly from previous suggestions of the shadow pager or of creating tree structures in the journal file. It does not completely virtualise the pager level, although the concept is similar. It requires only trivial extra structure in the journal file, since it uses the real main file as an index into the roll-forward section of the journal. If this kind of scheme were to be implemented in the future, the groundwork in file format changes could be laid now in a forward-compatible way by allocating the necessary spaces and always ensuring they had a zero value.

Thought 2: I'm a little concerned about when SQLITE_BUSY can be returned. In section 7.0, entitled "Transaction Control At The SQL Level", a mention is made of locks not being acquired with the BEGIN statement. Personally I don't like to see SQLITE_BUSY at all. I currently modify my sqlite version to use blocking locks in restricted ways to avoid getting the message and ensure optimum fairness. If busy returns do occur, I would prefer they happen at well-defined and designated places. Hmmm... I guess I can't think of any cases where this is really an issue, though.

I would like to see blocking locks supported by SQLITE. If that's not possible it's ok, but my preference is that the capability should exist. Currently sqlite provides an API to execute a function when SQLITE_BUSY would be returned. That's ok, but doesn't suit blocking locks well for two reasons:

1) The locking semantics of sqlite use operating system locks in specific ways that would be unwise to mess with in a callback function.
2) I don't believe there is an API to register a corresponding unlock function to sqlite_busy_callback, so whatever locks might be put in place can't be unmade at appropriate times.

Perhaps the API should be changed to support replacement of the various os.c lock functions for each of the specific lock types in the new sqlite locking model.

As a matter of interest, the current sqlite isn't far off being able to work with blocking locks in place of its existing non-blocking locks. The main prohibition that needs to be imposed is that shared locks cannot be upgraded to exclusive locks. The current sqlite can be "tuned" to ensure exclusive locks are obtained early to prevent blocking locks from deadlocking. I haven't seen the new sqlite3 code, and haven't seen detail of how the various locking mode transitions will be implemented in a posix environment, so I don't know whether extra problems will be introduced in this area. By my reading the allowable transitions will be these:

Unlocked -> Shared
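For anyone who wants to see the current non-blocking behaviour concretely: in the sketch below (Python's sqlite3 module; the file name is invented), one connection takes the reserved lock via BEGIN IMMEDIATE, after which a second writer gets the busy error back rather than blocking until the lock is free:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None gives autocommit, so the explicit BEGIN below
# is the only open transaction.
writer = sqlite3.connect(path, timeout=0.1, isolation_level=None)
writer.execute("CREATE TABLE t (x)")
writer.execute("BEGIN IMMEDIATE")  # takes the reserved lock

other = sqlite3.connect(path, timeout=0.1, isolation_level=None)
try:
    other.execute("BEGIN IMMEDIATE")  # second reserved lock: refused
    busy = False
except sqlite3.OperationalError:      # SQLITE_BUSY surfaces here
    busy = True

writer.execute("COMMIT")
print(busy)  # True
```

A blocking-lock sqlite would instead have the second BEGIN IMMEDIATE sleep until the first writer committed, which is exactly the behaviour I keep patching in.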
[sqlite] create trigger before commit...
G'day,

I seem to recall (but cannot find) an email on this mailing list sent by drh quite some time ago regarding triggers before commits. As I recall, he was battling with some kind of problem for which a trigger that executed immediately before the commit seemed to be the best solution.

I'm currently working on/planning/thinking about an accounting system. The system essentially consists of a transaction table, an account table, and a transaction entry table as follows:

CREATE TABLE AccountTable (AccountId INTEGER PRIMARY KEY, name);
CREATE TABLE TransactionTable (TransactionId INTEGER PRIMARY KEY, date, memo);
CREATE TABLE TransactionEntryTable (TransactionId, AccountId, amount);

I want to put some foreign key constraints in, and I think I can do that with triggers, but the tricky constraint of the data model is that sum(TransactionEntryTable.amount) must always equal zero for each TransactionId (this is double-entry bookkeeping). My thinking is that I can check this with something approximating the following pseudo-triggers and table:

CREATE TABLE DirtyTransactions (TransactionId UNIQUE);

CREATE TRIGGER addDirtyTransaction BEFORE INSERT OR UPDATE OR DELETE ON TransactionEntryTable
BEGIN
  INSERT OR REPLACE INTO DirtyTransactions VALUES(old.TransactionId);
  INSERT OR REPLACE INTO DirtyTransactions VALUES(new.TransactionId);
END;

CREATE TRIGGER checkDirtyTransactions BEFORE COMMIT
BEGIN
  SELECT RAISE(Abort) WHERE EXISTS (
    SELECT 1 FROM DirtyTransactions JOIN TransactionEntryTable USING (TransactionId)
    GROUP BY TransactionId HAVING sum(amount) <> 0);
  DELETE FROM DirtyTransactions;
END;

So, did the BEFORE COMMIT trigger get implemented? If so, is it stable/usable? I don't see it on the sqlite language page. If not, can anyone think of an alternative way to do this that doesn't require a table-scan of TransactionEntryTable each time a new transaction is added?

Benjamin.
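In the absence of a BEFORE COMMIT trigger, the fallback I can see is running the balance check in application code just before committing. A sketch using Python's sqlite3 module (the helper name and the idea of passing the dirty transaction ids in from the application are my own invention, mirroring the DirtyTransactions table above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TransactionEntryTable (TransactionId, AccountId, amount)")

def commit_checked(conn, dirty_ids):
    """Commit only if every touched transaction sums to zero."""
    for tid in dirty_ids:
        (total,) = conn.execute(
            "SELECT coalesce(sum(amount), 0) FROM TransactionEntryTable"
            " WHERE TransactionId = ?", (tid,)).fetchone()
        if total != 0:
            conn.rollback()
            raise ValueError("transaction %r does not balance" % tid)
    conn.commit()

# A balanced double entry: debit one account, credit another.
conn.execute("INSERT INTO TransactionEntryTable VALUES (1, 10, 500)")
conn.execute("INSERT INTO TransactionEntryTable VALUES (1, 20, -500)")
commit_checked(conn, {1})
```

Because only the dirty transaction ids are checked, this avoids the full table-scan of TransactionEntryTable on each commit, which was the point of the DirtyTransactions table in the first place.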
Re: [sqlite] getpid() and linux threads in sqlite
Hello,

"Mrs. Brisby" <[EMAIL PROTECTED]> 03/06/2004 01:37 PM
To: Daniel K <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED]
Subject: Re: [sqlite] getpid() and linux threads in sqlite

> On Wed, 2004-06-02 at 01:53, Daniel K wrote:
> > The getpid() call on my Redhat9 linux box is causing
> > some trouble for SQLite. It's reasonably complicated,
> > but SQLite currently assumes that on linux the
> > getpid() function returns a different value in
> > different threads.
>
> Anyway, it shouldn't cause any harm. SQLite only uses the pid to help
> work around problems with fcntl-locking (esp. when using NFS) and to
> seed the random number generator.

I believe the software Mr K is working on is sqlite version 3, and that it is file locking issues he's looking into.

Benjamin.
Re: [sqlite] Sqlite Secure backup
David Costa <[EMAIL PROTECTED]> 03/06/2004 09:26 AM
To: [EMAIL PROTECTED]
Subject: [sqlite] Sqlite Secure backup

> I am in the process of writing a PEAR package to handle Sqlite backups
> (both local backups and remote backups) and I am looking for some
> suggestions.
> Is the right way to proceed the db dump ? I assume that copying the db
> will result in a corrupted file if the db is used at the time of
> backup.

Using .dump in the sqlite command-line utility is always going to be the simplest, lowest-risk solution. I recommend it, especially for small databases. Using .dump, you don't have to worry about other readers and writers. You don't have to worry about stray -journal files. All of this is taken care of by the engine itself. You end up with a sequence of sql commands that are suitable for rebuilding the sqlite database. This output has a number of advantages over the basic file format:

1) Although sqlite database changes have in the past been forward compatible, it is more likely that the .dump format of an ancient version will be restorable on modern sqlite versions than that the old database file will be restorable.
2) .dump output is more likely to be backwards compatible, in case you decide you have to downgrade your sqlite version.
3) The dump file is human-readable and is more likely to be able to be applied to other database software if the need arises.
4) The .dump output is more compressible than the original file format. In gzip form it will usually take up less space than the gzip of the original database file.

The only significant downside to the .dump format in my experience is that it takes longer to restore from your media than a straight ungzip would take. This is particularly true if you have large tables with indices. On some (old) hardware I use, a 400meg file takes only a few seconds to restore from gzip on backup media, but the equivalent .dump format takes on the order of half an hour to rebuild its tables and indices.
If you have large tables and restoration time is critical, the original sqlite files are probably the best thing to back up. If you're more interested in the security of your data, I recommend backing up the .dump output instead.

Benjamin
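For scripted backups the same idea is available outside the shell. For instance, Python's sqlite3 module exposes the dump as iterdump() (a rough sketch of the dump-and-restore cycle, not a PEAR/PHP answer as such):

```python
import sqlite3

# Build a small source database to back up.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (x)")
src.execute("INSERT INTO t VALUES (42)")
src.commit()

# The text backup: a sequence of SQL statements, as with .dump.
dump_sql = "\n".join(src.iterdump())

# Restoring is just replaying those statements into a fresh database.
restored = sqlite3.connect(":memory:")
restored.executescript(dump_sql)
(x,) = restored.execute("SELECT x FROM t").fetchone()
print(x)  # 42
```

The dump text gzips well and survives version changes, with the same restore-time caveat as above for large indexed tables.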
Re: [sqlite] Re: sqlite-users Digest 22 May 2004 05:23:11 -0000 Issue 115
G'day,

Ulrik Petersen <[EMAIL PROTECTED]> 23/05/2004 07:31 AM
To: [EMAIL PROTECTED]
Subject: [sqlite] Re: sqlite-users Digest 22 May 2004 05:23:11 -0000 Issue 115

> Read especially section 3.2. The EULA is here:
> http://msdn.microsoft.com/visualc/vctoolkit2003/eula.aspx
> IANAL, but the way I understand it, you can't link against their
> libraries and still distribute your code under an Open Source license,
> or distribute your binaries under a license that requires that the
> software be offered at no charge. My understanding may be flawed, so
> read the EULA yourself before deciding whether the toolchain is for you.

IANAL either, but my reading is a little different. I think that all it's saying is that you can't release your software under a license that forces the -microsoft- libraries to be open source. According to www.gnu.org, that shouldn't be a problem:

http://www.gnu.org/licenses/gpl-faq.html#WindowsRuntimeAndGPL

Anyway, better solutions have already been offered so I'll say no more on this thread.

Benjamin.
Re: [sqlite] command history
Puneet,

Puneet Kishor <[EMAIL PROTECTED]> 03/05/2004 05:01 AM To: SQLite <[EMAIL PROTECTED]> cc: Subject: [sqlite] command history

> Is there a way to enable some kind of command history in the SQLite
> shell? You know, press the up arrow to get the previous command... I
> think it is called READLINE support, no?

If you're still having trouble, perhaps you could try starting sqlite in a different way. Instead of this:

$ sqlite my.db
> SELECT ...;
> CREATE ...;

try:

$ sqlite my.db 'SELECT ...;'
$ sqlite my.db 'CREATE ...;'

This method doesn't allow you to run transactions across multiple lines, but does give the advantage of immediately conforming to the way your shell does its command history :) I like to do things this way with ksh and vi editing keys, myself.

Benjamin.
[sqlite] Effectiveness of PRAGMA integrity_check;
G'day,

I'm trying to write some defensive code that is able to recover from database corruption. The idea is that if a disk fails and a database becomes corrupt, it can be detected and synchronised from a backup copy. To this end, I've just been trying to write a function that returns true only when it is sure the database is ok. I use PRAGMA integrity_check; and compare the returned string with "ok".

When I tried this with a few random database changes, though, I had a hard time trying to get the corruption to trigger. I did the following:

CREATE TABLE foo(bar);
INSERT INTO foo VALUES("bar");

I then went in with a text editor and started modifying the bar record. I changed "bar" to "car", but the change was not detected. I started modifying characters to the left and right of the "car" string, but still no corruption. I was able to get corruption to be detected when I truncated the file.

Can I take it from this behaviour that there isn't any checksum checking going on apart from headers and the BTrees themselves? Will the integrity_check at least guarantee me that I won't at some later stage get an SQLITE_CORRUPT return?

Benjamin.
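The defensive check described above can be sketched with Python's sqlite3 module (the function name is invented; as the post observes, this catches structural damage such as truncation, not bit-flips inside record data):

```python
import sqlite3

def database_ok(path: str) -> bool:
    """Return True only when PRAGMA integrity_check reports 'ok'.

    Any error opening or reading the file (e.g. a truncated or
    overwritten header) also counts as corruption.
    """
    try:
        con = sqlite3.connect(path)
        rows = con.execute("PRAGMA integrity_check;").fetchall()
        con.close()
    except sqlite3.Error:
        return False
    return rows == [("ok",)]
```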
[sqlite] Clustered indices Was: [sqlite] A proposal for SQLite version 3.0
Jeff,

Jeff Pleimling <[EMAIL PROTECTED]> 08/04/2004 12:42 PM To: [EMAIL PROTECTED] cc: Subject: Re: [sqlite] A proposal for SQLite version 3.0

At 12:08 PM 4/8/2004 +1000, [EMAIL PROTECTED] wrote:

> I believe you're thinking of a 'clustered index'. This puts the data
> into the order of the index. There can be, of course, only one clustered
> index per table.
> Since at least some of the data in the table is moved around on every insert,
> regular indexes need to take this into account (usually by indirection, rather
> than modifying all of the indexes with each insert).

Ahh, I didn't think of that. I don't have any other indices on my table, so this wouldn't be a problem for me... but I can see now how it would harm the general case. I guess the simplest implementation would have a prohibition on having -any- other indices on the table.

> >If a table could be ordered according to an index, rather than having an
> >external index, I think it would significantly improve the time and space
> >performance of my databases. I don't know whether my experience would be
> >shared by other users. If it were something that could go into 3.0 it
> >would at least do me some good.

> Clustered indexes can really slow the performance for OLTP (On-Line
> Transaction Processing) and other systems where data is added/deleted in a
> mixed fashion. Every time a record is inserted, data is possibly moved on
> the disk (with page splits causing even more slowdowns).

Yes, that's what's happening already in the index whenever I do an insert. My thinking was that maintaining an index-ordered table would be less work overall than maintaining a table with an ordered index. I could be wrong on that, but I'm not sure I see the flaw in my logic.

> If your system is entirely historic data, that would be great - but if your
> system is inserting meter readings in (near) real-time, you'd probably
> get a big performance hit.

It's certainly real-time, with the occasional query.
> There are many pros and cons. A google search turns up articles (usually for
> MS SQL Server) on both sides - some people saying 'always' and some 'never'.

I can certainly see how the usefulness of this feature could be limited. I guess the problem is primarily related to how the rowid is chosen. If I could choose a rowid that would put the table in the correct order, and renumber as required, I might make some ground. I could order by ROWID, then... although there may have to be some kind of code tweakage to allow the where clauses to operate correctly. Hmm... maybe something like this:

BEGIN TRANSACTION;
SELECT * FROM mytable WHERE <locate the insertion point>
-- if the insertion point is after current data:
INSERT INTO mytable VALUES (MAX(ROWID) + 10, <data>)
-- else if insertion point is between two values
INSERT INTO mytable VALUES ((<rowid before> + <rowid after>)/2, <data>)
-- else rearrange contiguous values
UPDATE mytable SET ROWID = ROWID+1 WHERE ROWID >= <insertion rowid> AND ROWID < <next gap>
INSERT INTO mytable VALUES (<insertion rowid>, <data>)
END TRANSACTION;

Perhaps the changes to sqlite could be as minimal as providing a facility to say: "I promise to keep these rows in an order consistent with this index" so sqlite will use the index in queries.

Benjamin.
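The gap-and-renumber scheme sketched above can be made concrete in a few lines of Python (table and column names are invented; the rowid-shifting step uses a sign-flip trick to avoid transient uniqueness collisions, a detail the SQL sketch glosses over):

```python
import sqlite3

def insert_ordered(con, key, data, gap=10):
    """Insert (key, data) keeping mytable physically ordered by ROWID.

    Leave gaps of `gap` between rowids, bisect into a gap when one
    exists, and shift a contiguous run of rowids when it doesn't.
    """
    cur = con.cursor()
    nxt = cur.execute(
        "SELECT MIN(ROWID) FROM mytable WHERE key > ?", (key,)).fetchone()[0]
    if nxt is None:                      # after all current data
        top = cur.execute("SELECT MAX(ROWID) FROM mytable").fetchone()[0]
        rowid = (top or 0) + gap
    else:
        prev = cur.execute(
            "SELECT MAX(ROWID) FROM mytable WHERE key <= ?",
            (key,)).fetchone()[0] or 0
        if nxt - prev > 1:               # a gap exists: bisect it
            rowid = (prev + nxt) // 2
        else:                            # no gap: shift the run up by one
            cur.execute("UPDATE mytable SET ROWID = -ROWID WHERE ROWID >= ?",
                        (nxt,))
            cur.execute("UPDATE mytable SET ROWID = -ROWID + 1 WHERE ROWID < 0")
            rowid = nxt
    cur.execute("INSERT INTO mytable(ROWID, key, data) VALUES (?, ?, ?)",
                (rowid, key, data))
```

With this in place, SELECT ... ORDER BY ROWID walks the table in key order, which is the effect the post is after.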
Re: [sqlite] A proposal for SQLite version 3.0
Peoples :)

"D. Richard Hipp" <[EMAIL PROTECTED]> 07/04/2004 11:22 PM To: [EMAIL PROTECTED] cc: Subject: [sqlite] A proposal for SQLite version 3.0

> A design proposal for SQLite version 3.0 can be found at:
> http://www.sqlite.org/prop2.html
> Feedback from the user community is strongly encouraged.

Since this is a fairly rare opportunity to make incompatible suggestions that could make it into code, I thought I'd put this left-field one out there:

My main use of sqlite is in a database that stores vast quantities of historical data (individual changes on meter readings, that sort of thing). The main table contains data for multiple instruments that each provide their data in time order with their own updates, but out of order with the updates of others:

Instrument1, 2am, value is 0
Instrument2, 1:59am, value is 3
Instrument1, 3am, value is 1
Instrument2, 3:01am, value is 4

This table is constructed in the order that data comes in, but the queries I want to do are quite different:

SELECT * FROM thetable WHERE instrument="Instrument1" AND time >= 2am AND time <= 3am ORDER BY TIME;

Well, I have an index to make this work efficiently (the exact query and index details probably aren't that important here). The problem is that the index contributes significantly to the size of the database, and the fact that I keep the original ordering around but unused seems like a waste of effort. Queries are also slower than they would be if they were following an integer primary key along a real table, with the extra O(log(n)) lookup for each result entry.

A little while ago a list reader suggested a kind of index (from ms access, if I recall... I don't recall the term they used) that is not external. Instead, the index changes the order in which the table itself is organised.
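For concreteness, the kind of query and supporting index alluded to above might look like the following sketch (the schema, the composite index, and the numeric stand-ins for the times are all my own invention; the post deliberately leaves those details out):

```python
import sqlite3

# A composite index on (instrument, time) serves both the equality
# test and the range scan, at the cost of a second B-Tree on disk.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE thetable(instrument, time, value)")
con.execute("CREATE INDEX thetable_lookup ON thetable(instrument, time)")
con.executemany("INSERT INTO thetable VALUES (?, ?, ?)", [
    ("Instrument1", 2.00, 0),   # 2am
    ("Instrument2", 1.98, 3),   # 1:59am
    ("Instrument1", 3.00, 1),   # 3am
    ("Instrument2", 3.02, 4),   # 3:01am
])
rows = con.execute(
    "SELECT * FROM thetable WHERE instrument = 'Instrument1' "
    "AND time >= 2 AND time <= 3 ORDER BY time").fetchall()
```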
You suggested at the time that if explicit ordering were required, the user could use a CREATE TABLE AS SELECT statement to get things in that order, but that sqlite would have to table-scan as it would have no saved knowledge of the table ordering. Moreover, the table would not remain ordered as new elements were inserted.

If a table could be ordered according to an index, rather than having an external index, I think it would significantly improve the time and space performance of my databases. I don't know whether my experience would be shared by other users. If it were something that could go into 3.0 it would at least do me some good.

The other thing that I think will help me most is the native storage of numeric formats. It's great to see this happening :)

On the wchar_t subject, I've just looked up http://docs.sun.com/db/doc/806-0477/6j9r2e2bp?a=view, which says wchar_t is a long under 32-bit solaris and an int under 64-bit solaris. Both numbers are 32 bits long. According to http://en.wikipedia.org/wiki/UTF-32, a 32-bit representation is the only unicode encoding that is actually fixed-width. Both UTF-8 and UTF-16 are multi-byte, rather than wide characters. This page also lists various known unicode encodings, so may be of some value.

Benjamin.
Re: [sqlite] Concurrency Proposal
G'day,

"D. Richard Hipp" <[EMAIL PROTECTED]> 31/03/2004 01:21 PM To: [EMAIL PROTECTED] cc: Subject: Re: [sqlite] Concurrency Proposal

> I think the gist of Ben's proposal is as follows
> (please correct me if I am wrong):
> Writes do not modify the main database file until
> they are ready to commit - meaning that readers can
> continue to read from the file during lengthy
> transactions. The journal contains modified pages,
> not the original pages, and is thus a roll-forward
> journal rather than a roll-back journal.

I think it's worth my posting a suggestion from a co-worker of mine who may be known to some, if only by his surname :) He posed the obvious question as to why the transaction is so long in the first place. My personal answer was that I have a transaction I keep open for a second at a time, in case more changes come through. That way I get maximum throughput while retaining the consistency guarantee that journaling provides. His alternative proposal for my situation is simple: buffer the changes, instead of holding a transaction open.

This is something my code could do fairly easily, and I'm a bit disappointed I didn't think of it :) If I ever get around to changing the code, I'll have it keep a fixed-size buffer of changes. Whenever the buffer fills, or one second passes since the first buffer entry was inserted, I'll flush the buffer with a transaction.

Maybe long write transactions concurrent with readers is a requirement for sqlite, but I'm not so sure it's my requirement anymore. Perhaps this simple suggestion will make it a requirement for fewer current sqlite mailing list users, too ;)

Benjamin.
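The buffer-then-flush idea described above can be sketched as a small class (Python's sqlite3 module stands in for the C API; the class, table name, and thresholds are invented for the illustration):

```python
import sqlite3
import time

class BufferedWriter:
    """Buffer inserts in memory and flush them in one short transaction.

    Instead of holding a write transaction open for a second, accumulate
    changes and flush when the buffer fills or the oldest entry reaches
    max_age seconds, so the write lock is only held briefly.
    """
    def __init__(self, con, max_rows=100, max_age=1.0):
        self.con, self.max_rows, self.max_age = con, max_rows, max_age
        self.buf, self.first_at = [], None

    def add(self, row):
        if not self.buf:
            self.first_at = time.monotonic()
        self.buf.append(row)
        if (len(self.buf) >= self.max_rows or
                time.monotonic() - self.first_at >= self.max_age):
            self.flush()

    def flush(self):
        if self.buf:
            with self.con:   # one brief BEGIN..COMMIT around the batch
                self.con.executemany(
                    "INSERT INTO readings VALUES (?, ?, ?)", self.buf)
            self.buf = []
```

A real deployment would also want a timer (or a flush on shutdown) so a lone buffered row is not stranded past the age limit.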
Re: [sqlite] Concurrency Proposal
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 31/03/2004 02:38 PM - Ben Carlyle 31/03/2004 02:34 PM To: "D. Richard Hipp" <[EMAIL PROTECTED]>@CORP cc: Subject: Re: [sqlite] Concurrency Proposal

Peoples,

"D. Richard Hipp" <[EMAIL PROTECTED]> 31/03/2004 01:21 PM To: [EMAIL PROTECTED] cc: Subject: Re: [sqlite] Concurrency Proposal

> I think the gist of Ben's proposal is as follows
> (please correct me if I am wrong):
> Writes do not modify the main database file until
> they are ready to commit - meaning that readers can
> continue to read from the file during lengthy
> transactions. The journal contains modified pages,
> not the original pages, and is thus a roll-forward
> journal rather than a roll-back journal.

That's pretty accurate. If the modified pages could all be kept in memory, the pager could contain either the new or the old pages. It wouldn't really matter, but if pages need to be flushed from memory they would have to be put into the journal.

> The main difficulty in this approach is locating
> a particular page in the journal file when it needs
> to be reloaded. For example, suppose page X is read
> from the main database, modified, then written into
> the journal. Due to cache pressure, that page is then
> flushed from the in-memory cache. Later on, we need
> to read page X again. How do we locate the position
> in the journal where the current data for page X was
> written so that we can reread it?

I'm not surprised this has come up as a problem. I wasn't sure whether sqlite flushed changed pages back to the original database file. Since the whole algorithm hinges on the main file not being changed, any overflow would have to be written to the journal. Hmm... an on-disk hash, or maybe ordered pages? If this were to proceed, ordered pages might be the simplest approach but wouldn't perform well. The algorithm would simply be to use binary searches over the journal file's known-sized blocks to find the pages.
This would also have to be done to find an insert point for new pages, and then everything afterwards would have to be shifted to accommodate. Hashes would be able to place things explicitly, but you'd have empty pages (of undefined content) throughout the journal that you'd have to explicitly write "this is not a page" headers into. Hmm.

Obviously the poor performance of ordered pages would only become a factor if your in-memory cache were exceeded, so it could be optimised... but there comes a point where you have a few gig of database and a corresponding couple-o gig of index that all need to be updated every time you do an insert. Large databases lead to large changes, at least every once in a while.

Doug's shadow pager idea looks like it would effectively be an optimised form of the ordered pages concept[1]. If the journal had an on-disk linked list of ordered page numbers and their in-journal indices, searching and data modification would occur mostly in-memory, and could be done with constant memory usage:

Pager file :- ...

A BTree with the corresponding navel-gazing might be better, but consider the following search algorithm:

Read first index block
While the greatest page in this block is less than the target page, skip to the next index block.
Do binary search in memory

Anyway :) It's just some ideas that fell out of my head. It may or may not be practical.

Benjamin.

[1] I haven't read the proposal in detail, so I apologise if I'm making misrepresentations
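The three-step search loop above can be sketched in Python; here `index_blocks` is a made-up in-memory stand-in for the on-disk linked list of index blocks, each holding sorted (page number, journal offset) pairs:

```python
import bisect

def find_page(index_blocks, target_page):
    """Locate the journal offset holding `target_page`, or None.

    Skip whole index blocks whose greatest page number is too small,
    then binary-search within the first block that could contain the
    target - exactly the loop described in the post.
    """
    for block in index_blocks:
        if block and block[-1][0] >= target_page:
            pages = [p for p, _ in block]
            i = bisect.bisect_left(pages, target_page)
            if i < len(pages) and pages[i] == target_page:
                return block[i][1]
            return None      # position i would be the insert point
    return None
```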
[sqlite] Concurrency Proposal
G'day,

I've been mulling over this for a little while, but only got around to writing anything down last night. I may have made some errors, perhaps serious ones. Either way, I thought I'd just put it out there. This email contains an algorithm for permitting concurrent reader and writer access to an sqlite database. The algorithm has the following characteristics:

- File-based locking (rather than page or row-level locking) is retained
- Semantics of current operations remain the same, as far as I can tell
- A change to the sqlite file format would be required
- The solution is (probably) not backwards-compatible
- Multiple readers can access the database at the same time
- A single writer can prepare changes to the database while readers are still reading
- The writer locks out all readers and other writers while it performs a commit operation
- Blocking locks would be supported (and may be required in some places)
- Lock periods are short, even when transactions are held open for a long time. This may mean blocking locks are more appropriate than they were previously. A process that holds a database open for a second at a time to improve throughput will only block readers on the second boundary, although it will still block other writers for the full second duration.
- Inconsistency only occurs if a writer fails to apply its transaction to the main file at commit time, not if it fails while the transaction is still open.
- More synching may be proposed than in the current algorithm, but I'm pretty picky about synchs
- The journal file is used for writer locking contention, so can't be erased at the end of each transaction. It can be erased safely only when there are no writers currently waiting.
- The algorithm doesn't allow modification of the main file (apart from journal-based recovery) until the commit phase of the writer algorithm.
Reader algorithm:
  open main file
  read-lock main file
  Check for inconsistency marker in main file
  If inconsistent
    release main file read-lock [1a]
    open journal file
    write-lock journal file
    write-lock main file [1b]
    perform journal playback routine [2]
    sync main file
    clear inconsistency marker in main file
    sync main file
    truncate journal file
    release journal file write-lock
    downgrade main file write-lock to read-lock [3]
  perform queries
  flush block cache
  release main file read-lock

Notes:
[1] Don't upgrade read lock to write lock, otherwise deadlock may occur for blocking lock. Never attempt to lock the journal file while you have a lock on the main file.
[2] May be a no-op, as a process who got the write-lock first may have already performed the playback
[3] Consistency is now guaranteed, and a valid read-lock exists on the main file

Writer algorithm [1]:
  open journal file
  write-lock journal file
  write-lock main file [2a]
  Check for inconsistency marker in main file
  If inconsistent
    perform journal playback routine
    sync main file
    clear inconsistency marker in main file
    sync main file
  downgrade main file write-lock to read-lock [2b]
  truncate journal [3]
  perform updates in-memory and in-journal only. Do not modify main file.
  Commit:
    sync journal file
    upgrade main file read-lock to write-lock
    write inconsistency marker into main file [4]
    sync main file
    write changes into main file
    sync main file
    clear inconsistency marker for main file
    sync main file [5]
    truncate journal file [6]

Notes:
[1] To avoid deadlock, a reader should never attempt to "become" a writer. The reader must first drop its read lock (with associated cache flush), and then begin the Writer algorithm steps.
[2] Write-lock main instead of read-lock as a precaution. If we read-lock it, readers could open the database concurrently, see the same inconsistency as we do, and get caught behind our transaction as writers trying to resolve it.
With a write-lock up-front until we're sure no inconsistency exists, we get to avoid this degenerate case. This ends up meaning a writer briefly locks the main file twice: once at the beginning of the transaction, and once at the end.
[3] At this point the writer has a write lock on the journal, a read lock on the database, and the journal is empty. No inconsistency exists in the database.
[4] The use of an inconsistency marker would be the main incompatibility between the proposed and existing sqlite implementations. I see this most probably as an inconspicuous boolean that could sneak into an sqlite database header.
[5] Why so many syncs? Well, to be safe. We want to know all the data we might roll back into the main file is written to disk before we record the fact that that roll-back has to occur. We want to know that the inconsistency is noted before we start modifying the database, otherwise the modifications could be written but not the inconsistency record. We want to know that the data has been fully updated before we clear the
Re: [sqlite] Search results
Greg,

"Greg Obleshchuk" <[EMAIL PROTECTED]> 26/03/2004 03:29 PM To: <[EMAIL PROTECTED]> cc: Subject: [sqlite] Search results

> In the database I will have three columns where I will be searching for matches, my question is what type of method do people/systems use to rate a match in the database?

It's difficult to rate exact matches, because all search results returned exactly what was requested. As such, the only way to find a match that is better than other matches is to have the user refine their search.

Rated search results are more useful when matches are inexact. A word that is spelled nearly the same as the one you searched for. A result that only contained four out of your five queried words. They can be rated with respect to each other because they match the query to a better or worse extent.

For exact matches you really have to come up with another method of deciding how relevant the results were. Google uses information on how popular a web-page is to determine which pages are most likely to be relevant when users search. If you could get feedback on how useful particular search results were, you could make sure the best results were returned for later searches. If you could predict how useful results are likely to be when you enter the records, you could provide a baseline ranking to start from.

Anyway, tricky subject :) No quick answers ;)

Benjamin
Re: [sqlite] Text(3)
"Anabell Chan" <[EMAIL PROTECTED]> 24/03/2004 05:23 PM Please respond to anabell To: <[EMAIL PROTECTED]> cc: Subject:[sqlite] Text(3) > Firstly, how do I define a string table field with constraints on its size? > For example Text(3), string of three characters. Secondly, how is it > enforced during updates? Many thanks! Sqlite will make you work a little to enforce contraints like this. If you really want to, here's how: CREATE TABLE foo (a); -- we want a to be at most three characters long CREATE TRIGGER foo_a_insert_constraint BEFORE INSERT ON foo BEGIN SELECT RAISE(ABORT, "a is too long") WHERE length(NEW.a) > 3; END; CREATE TRIGGER foo_a_update_constraint BEFORE UPDATE OF a ON foo BEGIN SELECT RAISE(ABORT, "a is too long") WHERE length(NEW.a) > 3; END; sqlite> INSERT INTO foo VALUES("abcd"); SQL error: a is too long sqlite> INSERT INTO foo VALUES("abc"); sqlite> Adjust table and column names, and string length to suit. See http://sqlite.org/lang.html#createtrigger for reference material. This approach can be taken to check a wide variety of constraints. If you can define a WHERE clause that can detect a problem, you can abort the offending operation using a couple of triggers. Does anyone have a more succinct version of the above? Benjamin. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] sqlite not sharing page cache across processes?
Mark, "Mark D. Anderson" <[EMAIL PROTECTED]> 22/03/2004 05:15 AM To: [EMAIL PROTECTED] cc: Subject:[sqlite] sqlite not sharing page cache across processes? > It appears from looking at pager.c that every process working > with a database acquires its own page cache by use of read(). > They independently keep their individual cache in sync by > playing back the (single) journal. ... > But is my understanding of the tradeoff correct? My reading of the code has the readers flushing their cache often, and not using the journal at all to maintain consistency. My reading has the reader start with nothing in the cache, lock the database with a read lock, perform as many SQL queries as are in the single text string it's currently parsing, and flush its cache around the time it unlocks the database. The cache remains valid only as long as the lock is in-place, and the operating-system is responsible for any caching between locks. Replay of the journal only occurs if a writer has terminated (lost its lock without committing or rolling-back its transaction). The next reader will roll-back the transaction by restoring original pages from the journal. Benjamin - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: [sqlite] Resetting primary key auto-increment after table re-create
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 09/03/2004 10:56 AM - Ben Carlyle 09/03/2004 10:54 AM To: "Roger Hawkins" <[EMAIL PROTECTED]>@CORP cc: Subject: RE: [sqlite] Resetting primary key auto-increment after table re-create

Roger,

"Roger Hawkins" <[EMAIL PROTECTED]> 08/03/2004 03:18 PM To: <[EMAIL PROTECTED]> cc: Subject: RE: [sqlite] Resetting primary key auto-increment after table re-create

> Thanks for your reply...
> I havent found any reference to this in the doco so Im not sure how any
> of this might work..

http://sqlite.org/faq.html#q1

Pay particular attention to the "is logically equivalent" part.

> So what you are saying is that if I recreate the table I can insert all
> the old values back into the newly created table (including the primary
> key values) and the primary key column wont complain? That's a bit
> scary!

So long as you don't insert the same number into the INTEGER PRIMARY KEY twice, sqlite will not complain. It's primarily a primary key. It only doubles as an auto-increment field when you insert NULLs into it.

> Even if this is true what happens when I next insert a value (normally
> without specifying the primary key field) - does it just pick up the
> latest?
> Any one tried this?

Yes. You can, too:

CREATE TABLE foo(bar INTEGER PRIMARY KEY);
INSERT INTO foo VALUES(NULL);
INSERT INTO foo VALUES(2);
INSERT INTO foo VALUES(NULL);
INSERT INTO foo VALUES(3); -- SQL Error: PRIMARY KEY must be unique
SELECT * FROM foo;
1
2
3

Benjamin.
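The same session can be replayed through Python's sqlite3 module, which makes the behaviour easy to verify: NULL auto-assigns the next integer, explicit values are honoured, and a duplicate key is rejected.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE foo(bar INTEGER PRIMARY KEY)")
con.execute("INSERT INTO foo VALUES(NULL)")    # auto-assigned: 1
con.execute("INSERT INTO foo VALUES(2)")       # explicit value honoured
con.execute("INSERT INTO foo VALUES(NULL)")    # picks up after 2: 3
try:
    con.execute("INSERT INTO foo VALUES(3)")   # duplicate key
except sqlite3.IntegrityError:
    pass                                       # PRIMARY KEY must be unique
rows = [r[0] for r in con.execute("SELECT bar FROM foo")]
```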
Re: [sqlite] Delete a database file
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 27/02/2004 09:54 AM - Ben Carlyle 27/02/2004 09:34 AM To: "Benoit Gantaume" <[EMAIL PROTECTED]>@CORP cc: Subject: Re: [sqlite] Delete a database file

Benoit,

"Benoit Gantaume" <[EMAIL PROTECTED]> 26/02/2004 08:45 PM To: <[EMAIL PROTECTED]> cc: Subject: [sqlite] Delete a database file

> I just want to suppress a database file.
> But it's impossible to do it from the process that created or opened the database, even if I closed it!
> I guess that I do not close properly the database.
> The only way to be able to remove the file is to close the process that used the database...
> Do you have some tip?

Under unix, just unlink it. The file will actually be deleted when your process closes it. Under Windows, you can't delete any open file of any kind, ever. You have to close the file first. If you issue sqlite_close calls for each sqlite* pointer you received from sqlite_open, the file will be closed.

But the question is, why do you want to close the file? Just to free up some space? Why are you using an on-disk database when you could use an in-memory database?

Benjamin.
Re: [sqlite] multithreading
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 27/02/2004 09:53 AM - Ben Carlyle 27/02/2004 09:49 AM To: [EMAIL PROTECTED]@CORP cc: Subject: Re: [sqlite] multithreading

Dave,

[EMAIL PROTECTED] 26/02/2004 04:56 PM To: [EMAIL PROTECTED] cc: Subject: [sqlite] multithreading

> Since I'm
> debugging, I'm crashing a lot before I have a chance to sqlite_close()
> the database. Is that the problem? If it is, is there any way to
> unwedge a database if the app crashes before closing it?

sqlite uses operating-system locks. When your process terminates, the operating system will remove all locks that process owned. I presume from the fact you can delete the file while it's still locked (and therefore open) that you're running a unix variant? I would be checking the output of lsof to see exactly which process still had the file open.

> I'm trying to access the database from two threads: the UI thread
> mostly reads and occasionally updates rows, while the network thread
> loads new messages and inserts them. Will that work, or do I just
> misunderstand how SQLite does multithreading?

It is important that each thread has its own sqlite*, each coming from a separate sqlite_open call. If that condition is met, nothing should actually misbehave. Be warned, though, that each thread will lock out all other threads while it is querying or modifying the database. If any thread locks the database for too long, the other threads attempting to access the database will have SQLITE_BUSY returned. It is possible to hack blocking locks into sqlite to make the waiting thread sleep instead, but I'll leave that as an exercise for the reader (or drh).

Benjamin
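The one-connection-per-thread rule above translates directly to Python's sqlite3 module; a sketch (table name and helper functions are invented, and the timeout plays the role of a blocking lock by making a busy thread wait rather than fail with SQLITE_BUSY):

```python
import sqlite3
import threading

def worker(db_path, rows):
    """Insert rows from one thread using its own connection.

    Each thread gets its own connection from its own open call;
    connections are never shared across threads.
    """
    con = sqlite3.connect(db_path, timeout=5.0)
    with con:   # one transaction; holds the write lock only briefly
        con.executemany("INSERT INTO msgs VALUES (?)",
                        [(r,) for r in rows])
    con.close()

def run(db_path, batches):
    """Run one worker thread per batch and wait for them all."""
    threads = [threading.Thread(target=worker, args=(db_path, b))
               for b in batches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```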
RE: [sqlite] multithreading
"rumcais" <[EMAIL PROTECTED]> 26/02/2004 05:16 PM To: <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]> cc: Subject:RE: [sqlite] multithreading > I've make some experiments around accessing the SQLite database fom concurent processes at the same time > and I find it work good when I'm using transactions every time I touch the database. > When I don't, it corrupt the database file sometimes. In sqlite, INSERT INTO foo VALUES("bar"); INSERT INTO foo VALUES("baz"); is equivalent to BEGIN TRANSACTION; INSERT INTO foo VALUES("bar"); END TRANSACTION; BEGIN TRANSACTION; INSERT INTO foo VALUES("baz"); END TRANSACTION; If you are getting database corruption then you are running into a very strange and very serious bug, or your code (not sqlite) is corrupting the database (in which case it is still a very serious bug, just not in sqlite :). I suggest you reinvestigate and report. If its reproducable it should be possible to track down. Benjamin. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] Primary key and index
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 09/02/2004 12:04 PM - Ben Carlyle 09/02/2004 12:04 PM To: Bertrand Mansion <[EMAIL PROTECTED]>@CORP cc: Subject: Re: [sqlite] Primary key and index

Bertrand,

Bertrand Mansion <[EMAIL PROTECTED]> 09/02/2004 04:29 AM To: <[EMAIL PROTECTED]> cc: Subject: [sqlite] Primary key and index

> Does the declaration of an INTEGER PRIMARY KEY imply the creation of an
> index ? I am asking because I have noticed a performance boost when I create
> an index on an INTEGER PRIMARY KEY when sorting rows with the primary key
> column.

Yes, but not in the way you're thinking.

> More generally, does declaring any column primary key imply that this
> column will be indexed ? I am not sure about that because when I do a PRAGMA
> index_list(), there is no index shown for the primary key.

Yes, although an INTEGER PRIMARY KEY is a special case of this.

> Thanks for any hints,

From http://www.sqlite.org/lang.html#createtable:

"Specifying a PRIMARY KEY normally just creates a UNIQUE index on the primary key. However, if primary key is on a single column that has datatype INTEGER, then that column is used internally as the actual key of the B-Tree for the table. This means that the column may only hold unique integer values. (Except for this one case, SQLite ignores the datatype specification of columns and allows any kind of data to be put in a column regardless of its declared datatype.) If a table does not have an INTEGER PRIMARY KEY column, then the B-Tree key will be a automatically generated integer. The B-Tree key for a row can always be accessed using one of the special names "ROWID", "OID", or "_ROWID_". This is true regardless of whether or not there is an INTEGER PRIMARY KEY."
i.e.:

1 Table = 1 BTree; the BTree holds the data and is ordered by ROWID
1 Table with 1 Index = 2 BTrees, the second referring to rows in the first
1 Table with PRIMARY KEY = 1 Table with 1 (unique) Index
1 Table with INTEGER PRIMARY KEY = 1 Table, with its own BTree forming its unique index

Benjamin.
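The distinction above is easy to observe directly. A sketch using a modern sqlite via Python (note that recent sqlite versions, unlike the one in the original post, do list the automatic index for a plain PRIMARY KEY):

```python
import sqlite3

# A plain PRIMARY KEY materialises an automatic unique index; an
# INTEGER PRIMARY KEY becomes the table's own B-Tree key, so no
# separate index is created.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE a(x TEXT PRIMARY KEY)")
con.execute("CREATE TABLE b(x INTEGER PRIMARY KEY)")
idx_a = con.execute("PRAGMA index_list(a)").fetchall()  # one autoindex
idx_b = con.execute("PRAGMA index_list(b)").fetchall()  # none
```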
Re: [sqlite] Converting Sqlite to Mysql
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 22/01/2004 11:13 AM - Ben Carlyle 22/01/2004 11:06 AM To: Martin Brinkmann <[EMAIL PROTECTED]>@CORP cc: Subject: Re: [sqlite] Converting Sqlite to Mysql

Martin,

Martin Brinkmann <[EMAIL PROTECTED]> 22/01/2004 10:02 AM To: [EMAIL PROTECTED] cc: Subject: [sqlite] Converting Sqlite to Mysql

> I tried to find an easy way of converting the file to mysql format, i want
> to use the data for a website of mine. Until now, i did not find a way to do it
> and now i would like to ask you if you could provide me with an easy to
> understand solution to this problem i face.

Have a look at the dump command on this web page: http://sqlite.org/sqlite.html

The simple answer is

sqlite filename.idb .dump > sqlcommands.txt

sqlcommands.txt will contain the SQL commands required to re-create filename.idb using sqlite. You may need to edit sqlcommands.txt to make sure the table columns have the correct types for mysql import.

Benjamin.
Re: [sqlite] Clarification on file locking in web-served apps
Vania, Vania Smrkovski <[EMAIL PROTECTED]> 15/01/2004 12:14 PM Please respond to vania To: "SQLite-Users (E-mail)" <[EMAIL PROTECTED]> cc: Subject:[sqlite] Clarification on file locking in web-served apps > Why the heck would anyone even need Transactions in the first place? If the DB file is locked, no changes can be made, right? The database file is only locked while a transaction is active. Since a transaction is implicitly started and ended around every update if one isn't already running, this means that whenever you're modifying the data the whole file is locked. When your transaction ends the file is unlocked again. Likewise, when you query the database the file is read-locked for the duration of the query. In this instance multiple programs (or threads) can query the database at the same time, but if any query is active updates have to wait. sqlite_open does not lock the database. You can keep the database open as long as you like, and whenever you're not actually executing SQL the file will be unlocked. Note that the other thing transactions give you is a guarantee of atomicity. Either the whole transaction gets written or none of it does. If your program crashes (or your machine loses power) before the end of the transaction, the partial updates will be reversed next time you open the database. > So I'm wondering if I am missing a few pieces Is the database file locked more than once during a process? That is, does it get locked as the Update statement is getting a collection of rows with which to apply the intended actions? Does it then release the lock as it prepares the update on this subset of data, and then re-lock when it prepares to write? If so, that would explain the need for a Transaction, as it leaves gaps of access during the transaction. If you do these as separate SQL statements without a transaction it's possible that someone else can get a write in between your own query and update.
> Ignoring Transactions for a second, if I have such a Select, will every user be locked behind a wall until the Select for user 1 is complete? > And if this is not the case for Select, will it be so for Update/Insert of this length? Other selects can operate concurrently, but updates will have to wait until all selects have finished. Benjamin
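The locking behaviour described above can be demonstrated with two connections to the same file. A hedged sketch in Python's sqlite3 (classic rollback-journal defaults assumed; file name and table are invented; timeout=0 makes a busy connection fail at once instead of polling):

```python
import sqlite3, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, timeout=0, isolation_level=None)
reader = sqlite3.connect(path, timeout=0, isolation_level=None)
writer.execute("CREATE TABLE t (x)")

writer.execute("BEGIN")
writer.execute("INSERT INTO t VALUES (1)")   # file is now write-locked

blocked = None
try:
    reader.execute("INSERT INTO t VALUES (2)")  # a second writer must wait
except sqlite3.OperationalError as e:
    blocked = str(e)
print(blocked)   # "database is locked"

# A plain query can still run while the writer holds its reserved lock,
# and it sees only committed data:
before = reader.execute("SELECT count(*) FROM t").fetchone()
print(before)    # (0,)

writer.execute("COMMIT")                      # file unlocked again
after = reader.execute("SELECT count(*) FROM t").fetchone()
print(after)     # (1,)
```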
Re: [sqlite] locked - what am I doing wrong?
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 09/01/2004 10:44 AM - Ben Carlyle 09/01/2004 09:35 AM To: "David Swigger" <[EMAIL PROTECTED]>@CORP cc: Subject:Re: [sqlite] locked - what am I doing wrong? G'day, "David Swigger" <[EMAIL PROTECTED]> 09/01/2004 12:04 AM To: <[EMAIL PROTECTED]> cc: Subject:[sqlite] locked - what am I doing wrong? > I am having a bit of a hard time understanding something: > When I am in the middle of stepping (sqlite_step) through the > results of a table query, the database is locked. Is this normal? Have > I missed something? This is discussed in Question 7 of the FAQ, but I think the first paragraph is a little misleading. I think that: "Multiple processes can have the same database open at the same time. Multiple processes can be doing a select at the same time. But only one process can be making changes to the database at once." should read: "Multiple processes can have the same database open at the same time. Multiple processes can be doing a select at the same time. But a process can't make changes to the database while any readers or other writers are active." The important thing to understand here is that you're not stepping through a set of results held in memory after being extracted from the database. You are (at least for simple queries) stepping through data that is still in the database. Because you're still actively reading the database, the writer thread can't get access until your query is finished. If, instead, you want to load the whole result set into memory before stepping through it, consider the sqlite_get_table_* functions. Alternatively you can collect the results into memory any way you like before stepping through the structures you've created yourself. Benjamin.
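The difference between stepping lazily and collecting first shows up clearly with two connections. A hedged sketch in Python's sqlite3, whose cursors step the underlying statement much like sqlite_step (rollback-journal defaults assumed; names are invented):

```python
import sqlite3, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "demo.db")
reader = sqlite3.connect(path, timeout=0, isolation_level=None)
writer = sqlite3.connect(path, timeout=0, isolation_level=None)
reader.execute("CREATE TABLE t (x)")
reader.execute("BEGIN")
reader.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
reader.execute("COMMIT")

# Step lazily: the SELECT is still active, so the database stays read-locked.
cur = reader.execute("SELECT x FROM t")
first = cur.fetchone()
blocked = None
try:
    writer.execute("INSERT INTO t VALUES (-1)")
except sqlite3.OperationalError as e:
    blocked = str(e)          # writer locked out mid-query
print(blocked)

# Drain the whole result set into memory; once the query is finished
# the read lock is dropped and the writer can get in.
rest = cur.fetchall()
writer.execute("INSERT INTO t VALUES (-1)")   # succeeds now
total = writer.execute("SELECT count(*) FROM t").fetchone()
print(len(rest), total)
```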
RE: [sqlite] Documentation
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 06/01/2004 08:34 AM - Ben Carlyle 06/01/2004 08:31 AM To: "Allan Edwards" <[EMAIL PROTECTED]>@CORP cc: Subject:RE: [sqlite] Documentation Hello, "Allan Edwards" <[EMAIL PROTECTED]> 06/01/2004 12:18 AM To: "'Roger Reghin'" <[EMAIL PROTECTED]>, "'Ralph Wetzel'" <[EMAIL PROTECTED]> cc: <[EMAIL PROTECTED]> Subject:RE: [sqlite] Documentation > CREATE TABLE Allan (DateTimeField DateTime); > INSERT INTO Allan (DateTimeField) VALUES ('2004-01-04 12:00:00'); > SELECT DateTimeField FROM Allan WHERE datetime(DateTimeField) = > datetime('2004-01-04 12:00:00'); > SELECT DateTimeField FROM Allan WHERE datetime(DateTimeField) > > datetime('2004-01-03 12:00:00'); > SELECT DateTimeField FROM Allan WHERE datetime(DateTimeField) < > datetime('2004-01-03 12:00:00'); Part of the appeal of this particular time and date format is that for normal equality and comparison tests like the ones you've done you don't need to call any datetime functions. String comparison is fine (at least until the year 1 needs to be represented) because the digits are always in exactly the same places in the string (which is why it's important to still include the leading 0 chars): SELECT DateTimeField FROM Allan WHERE DateTimeField = '2004-01-04 12:00:00'; SELECT DateTimeField FROM Allan WHERE DateTimeField > '2004-01-03 12:00:00'; SELECT DateTimeField FROM Allan WHERE DateTimeField < '2004-01-03 12:00:00'; By dropping the transformation you permit the current version of sqlite to use any relevant index to do these selections. I don't know the ANSI standard to which you referred, but XSD defines the datetime with a few extra tweaks: a T between the date and time, optional sub-second resolution, and an optional time-zone: '2004-01-04T12:00:00+10:00'. Benjamin.
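A quick check that plain string comparison really does sort these timestamps correctly (Python's sqlite3 used for illustration; the index is added to mirror the point about indexed selections):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Allan (DateTimeField TEXT)")
conn.execute("CREATE INDEX allan_dt ON Allan (DateTimeField)")
conn.executemany("INSERT INTO Allan VALUES (?)",
                 [("2004-01-03 12:00:00",), ("2004-01-04 12:00:00",)])

# No datetime() wrapper: zero-padded ISO strings compare correctly as
# text, so the index on the raw column is usable.
rows = conn.execute(
    "SELECT DateTimeField FROM Allan WHERE DateTimeField > '2004-01-03 12:00:00'"
).fetchall()
print(rows)  # [('2004-01-04 12:00:00',)]
```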
Re: [sqlite] Concurrency in SQLite
G'day, "D. Richard Hipp" <[EMAIL PROTECTED]> 24/11/2003 03:22 AM To: [EMAIL PROTECTED] cc: Subject:[sqlite] Concurrency in SQLite > Please, give me some examples of the kinds of things you are > doing which could benefit from improved concurrency. > * Are you holding transactions open for an extended period > of time? Why? This is my situation. I have a large amount of data flowing into a database which shows historical records (maybe a couple of thousand inserts per second). Queries are much rarer. To keep inserts efficient I hold transactions open for one second at a time. The most important change for me is one that I introduced into my copy: blocking locks. These are important because there is only an instant between the last transaction closing and the next beginning. In this scenario the poll-based locking mechanism currently used by sqlite is just not lucky enough to try at that instant. Only blocking locks, with their operating-system support, are sufficient to ensure that the readers get in at all. I also have a situation where I have multiple writers on the database that can run into the same problem. If you could ensure that readers could still read the untouched version of database blocks while a writer is working on a "dirty" version of the same blocks I wouldn't have any problems as far as reading goes. Writing would still be a problem, though. It's not the amount of concurrency that's a problem for me. One at a time is fine. It's just the ability to schedule the accesses that do happen very tightly together that I care about. > * How many processes do you have trying to access the database > at once? Usually at most two or three. > * How do you currently handle SQLITE_BUSY replies? Do you use > the sqlite_busy_handler() or sqlite_busy_timeout() APIs? The problem with both of these APIs is that they use timers between attempts.
If I could put a blocking lock on the database in the busy handler, allow the database access to occur, then get called back to unlock the database, it would be almost as good as the current blocking lock situation. > * How large are your databases? Usually less than a gig :) > * Do you ever put database files on a shared filesystem? No. Benjamin.
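The poll-based busy handling being criticised here is what sqlite_busy_timeout configures. A hedged sketch of it in Python's sqlite3, where the connect() timeout argument plays the same role (file name and table invented; a 0.2 s timeout keeps the demo quick):

```python
import sqlite3, tempfile, os, time

path = os.path.join(tempfile.mkdtemp(), "demo.db")
holder = sqlite3.connect(path, isolation_level=None)
holder.execute("CREATE TABLE t (x)")
holder.execute("BEGIN IMMEDIATE")   # take the write lock and sit on it

# timeout is sqlite's poll-based busy wait: retry for up to 0.2 s,
# then give up with SQLITE_BUSY instead of blocking until the lock frees.
waiter = sqlite3.connect(path, timeout=0.2, isolation_level=None)
start = time.monotonic()
gave_up = False
try:
    waiter.execute("INSERT INTO t VALUES (1)")
except sqlite3.OperationalError:
    gave_up = True
elapsed = time.monotonic() - start
print("gave up: %s after %.2f s" % (gave_up, elapsed))

holder.execute("COMMIT")
waiter.execute("INSERT INTO t VALUES (1)")   # lock released: succeeds at once
count = waiter.execute("SELECT count(*) FROM t").fetchone()
```

A true blocking lock would instead park the waiter in the OS until the exact instant the lock was released, which is the gap the post is describing.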
Re: [sqlite] database table is locked
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 07/11/2003 10:03 AM - Ben Carlyle 07/11/2003 10:03 AM To: Thiago Mello <[EMAIL PROTECTED]>@CORP cc: Subject:Re: [sqlite] database table is locked Thiago Mello <[EMAIL PROTECTED]> 08/11/2003 09:00 AM To: [EMAIL PROTECTED] cc: Subject:[sqlite] database table is locked > Im doing a SELECT sql query, and in the callback function of this sql > query I do a UPDATE, so when I do this update I get > database table is locked. > How I cant do the UPDATE in the second sqlite_exec() function?! Either separate the two operations or bring them closer together. To separate them, do something like: BEGIN TRANSACTION; SELECT ...; -- Collate results -- Action results: UPDATE ...; UPDATE ...; ... UPDATE ...; END TRANSACTION; To bring them closer together, do something like this: UPDATE foo SET bar = ... WHERE key IN (SELECT ...); It depends on what your query is and how you want to update the table in response to your table data. Benjamin.
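The "separate the two operations" approach can be sketched concretely: finish the SELECT completely and collect its results before issuing any UPDATE, so no query is active while writing. (Python's sqlite3 for illustration; the stock table and restock rule are invented.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (name TEXT, qty INTEGER)")
conn.executemany("INSERT INTO stock VALUES (?, ?)",
                 [("bolt", 5), ("nut", 0), ("washer", 3)])

# Collate results first: fetchall() finishes the SELECT, so nothing is
# reading the table when the updates run.
to_restock = [name for (name,) in
              conn.execute("SELECT name FROM stock WHERE qty = 0").fetchall()]

# Action results:
for name in to_restock:
    conn.execute("UPDATE stock SET qty = 10 WHERE name = ?", (name,))
conn.commit()

nut_qty = conn.execute("SELECT qty FROM stock WHERE name = 'nut'").fetchone()
print(nut_qty)  # (10,)
```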
Re: [sqlite] Performance problem
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 07/11/2003 10:00 AM - Ben Carlyle 07/11/2003 10:00 AM To: "Mrs. Brisby" <[EMAIL PROTECTED]>@CORP cc: Subject:Re: [sqlite] Performance problem "Mrs. Brisby" <[EMAIL PROTECTED]> 07/11/2003 12:08 AM To: "Jonas Forsman / Axier.SE" <[EMAIL PROTECTED]> cc: "Clark, Chris" <[EMAIL PROTECTED]>, "D. Richard Hipp" <[EMAIL PROTECTED]>, [EMAIL PROTECTED] Subject:Re: [sqlite] Performance problem > On Wed, 2003-11-05 at 23:59, Jonas Forsman / Axier.SE wrote: > > Note: Testing has shown PostgreSQL's hash indexes to be similar or slower > > than B-tree indexes, and the index size and build time for hash indexes is > > much worse. Hash indexes also suffer poor performance under high > > concurrency. For these reasons, hash index use is discouraged. > Please note I'm not talking about a hash of the entire key- I'm > talking about n distinct b-trees that are selected by an 8->n bit > function. This transformation can be made very fast: We get a speed > improvement here on searches if our 8->n bit function takes less time > than n-1 random memcmp()'s. How would you handle the lack of ordering associated with hash tables? Sqlite can currently use indices for three main tests: equals, less than, and greater than. While hash tables are good at finding equal-to in constant time, it usually means linear time (a table scan) to test for less than or greater than. Do you have a solution to this problem? Benjamin.
Re: Re[2]: [sqlite] Performance problem
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 04/11/2003 02:26 PM - Ben Carlyle 04/11/2003 02:01 PM To: Doug Currie <[EMAIL PROTECTED]> cc: Subject:Re: Re[2]: [sqlite] Performance problem Doug Currie <[EMAIL PROTECTED]> 03/11/2003 05:39 AM Please respond to Doug Currie To: [EMAIL PROTECTED] cc: Subject:Re[2]: [sqlite] Performance problem > > Can any readers suggest ways that I have not thought of for making > > large numbers of non-localized INSERTs go faster? > Instead of journaling the old ("undo") pages of idx2, simply record > the fact that idx2 is modified (i.e., make a single record in the > journal). Upon recovery, noticing that idx2 is modified will cause it > to be recreated from the rolled-back data. I've been thinking about this for a while and I think the solution, as with many other potential sqlite performance issues, is to solve it in the operating system, not sqlite itself. If you don't have enough bandwidth to your file, RAID it. If you want to be able to write a large number of transactions per second using journalling that lazily writes to the database, enable data-journalling on your filesystem. This is a counter-intuitive approach. When you apply data journalling on a filesystem such as Linux's ext3, the operating system writes all data to a journal before writing it to the appropriate locations on disk. You may think that because it's being written twice (four times in the case of sqlite, because the journal will be written twice and the file will be written twice) it will take longer to work. When you use fsync() a lot the opposite is actually true. You get the behaviour that other list members have been discussing with the combined undo/redo log. The data updates are written sequentially by the operating system to disk so seek-time doesn't come into it, and the fsync() simply ensures that your file has been written to the sequential log.
The operating system itself deals with the lazy writing to the actual disk, and it's transparent to sqlite because the operating system cache makes it appear that it's already been written. This only works when you use all-data journalling, not just meta-data like directory entries, and regular writes to the same filesystem might be a little slower... but try it. You should get better performance with data-journalling and sqlite. As this is a problem that can be solved very nicely indeed in the operating system I think that sqlite should make no special provisions for it. I do suggest that the relevant FAQs should contain the instruction to use a data-journalled fs if a large number of transactions must be pushed through sqlite. Benjamin.
Re: [sqlite] Journalling
Hello, v t <[EMAIL PROTECTED]> 27/10/2003 05:16 PM To: "Mrs. Brisby" <[EMAIL PROTECTED]> cc: sqlite <[EMAIL PROTECTED]> Subject:Re: [sqlite] Journalling > I am trying to use sqlite in a context where I will be using it to store some configuration about a system. I want to try to minimize the disk access. Since journalling uses a file on the disk, I wanted to turn it off. I am not worried about rolling back the database to a known state in case of some failure. You're not worried about your database becoming corrupted and all your data being destroyed? It doesn't sound like you like your data very much... ;) This is a question that pops up on the list every so often, and there have been some good reasons for it. Well. One comes to mind, and that's the use of flash memory in embedded devices. When you don't want to write to your media too many times you might find that it's better to turn off journalling and risk the consequences... perhaps make regular backups... rather than write to the media too often. The problem is that most people don't know what they're talking about when they ask how to turn journalling off. They don't understand when the journal gets written in the first place and they don't understand which operations they're performing that aren't affected by journalling. They haven't read the list archives, and they patently haven't read the manual, because it's listed under the pragma section of http://www.sqlite.org/lang.html. This is why when you ask the question on this list you get the response "Well I know you've asked how to turn off journalling, but what do you actually want to achieve by this and what alternatives have you considered?" You haven't yet given an explanation that makes sense to me, so in the spirit of RTFM I'll leave you to find the exact manual reference yourself. I think it's worth you understanding, though, that journalling doesn't occur when you're only querying the database. It only happens when you modify the database.
Using transactions while modifying the database is not only a good idea for data integrity, it also makes the overhead associated with synching the file to disk almost disappear so there's usually no need at all to turn off journalling. Given all of this, if you still can't find the exact spot in the manuals to turn this off yourself perhaps you could offer a more complete discussion about the nature of your database and your access to it. You'd be well advised to discuss the alternatives you have considered so that the gentle list members will feel more compelled to answer your question directly. Benjamin --Premature optimisation is the root of all evil
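The claim that transactions make the sync overhead almost disappear is easy to check. A hedged sketch in Python's sqlite3 (file name and the 200-row batch are arbitrary); on a real disk the per-statement loop is typically far slower because each statement is its own journal-create/sync/delete cycle:

```python
import sqlite3, tempfile, os, time

path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE t (x)")

# Autocommit: every INSERT is its own transaction and its own journal sync.
start = time.monotonic()
for i in range(200):
    conn.execute("INSERT INTO t VALUES (?)", (i,))
per_statement = time.monotonic() - start

# One explicit transaction around the batch: one journal cycle in total.
start = time.monotonic()
conn.execute("BEGIN")
for i in range(200):
    conn.execute("INSERT INTO t VALUES (?)", (i,))
conn.execute("COMMIT")
batched = time.monotonic() - start

count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print("autocommit: %.3f s, batched: %.3f s, rows: %d"
      % (per_statement, batched, count))
```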
Re: [sqlite] Problem with LIMIT and transactions? Or is it just me
Howdy, "Jay Macaulay" <[EMAIL PROTECTED]> 23/10/2003 05:55 AM To: <[EMAIL PROTECTED]> cc: Subject:[sqlite] Problem with LIMIT and transactions? Or is it just me > I am running into a real odd problem that I hope someone can explain. I > have a simple query: > BEGIN; > SELECT * FROM users WHERE uid >= 1 LIMIT 10; > COMMIT; > Simple enough. No reason to use a transaction, but I use them for another > reason, actually. The problem I run into is: If I execute the above > statement 3 times, it works fine. It takes 70 ms to execute the select > statement. On the fourth time of execution the select statement takes over > 5 seconds to execute!! I can execute the statement again after the 5 > seconds, and it's back to a 70 ms execution. Then after other time I > execute this statement it takes 5 seconds. This does seem difficult to explain. Is it possible that another process or thread is accessing the database occasionally (thus locking this "writer" out for a period?). Apart from that, you probably have to work through a process of elimination. Does it still happen without the LIMIT clause? Does it still happen if you take the transaction away? Is there anything else going on on the machine at the time? How are you running your test? How are you obtaining your timings? What's the schema of the database? Is there an index? Hopefully the answers to some of these questions will bring you closer to a conclusion. Benjamin.
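One way to tighten that process of elimination is to time the query in a minimal standalone harness, away from the rest of the application. A sketch (Python's sqlite3; the users schema is guessed from the post and the data is invented):

```python
import sqlite3, time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, "user%d" % i) for i in range(1, 1001)])

# Time each run separately, so an occasional slow run stands out from a
# query that is consistently slow.
timings = []
for attempt in range(1, 7):
    start = time.monotonic()
    rows = conn.execute("SELECT * FROM users WHERE uid >= 1 LIMIT 10").fetchall()
    elapsed_ms = (time.monotonic() - start) * 1000
    timings.append(elapsed_ms)
    print("run %d: %d rows in %.2f ms" % (attempt, len(rows), elapsed_ms))
```

If the 5-second spikes survive in a harness like this, the problem is in sqlite or the schema; if they vanish, look at the surrounding application (other threads, other processes, the lock-polling timeout).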