Thanks again for the responses, very helpful.

Taking into account that I can't just read the database file in my program without some sort of locking as was mentioned and explained clearly by multiple people (thank you!), I tried two different implementations to see which would perform better.

The writing mechanism remained the same in either case. For it I simply did (leaving out parameters to simplify):

sqlite3_open_v2(filename, &db, ...);
sqlite3_exec(db, "BEGIN TRANSACTION", ...);
for (size_t i = 0; i < sqlList.size(); i++)
  sqlite3_exec(db, sqlList[i].c_str(), ...);
sqlite3_exec(db, "COMMIT", ...);
sqlite3_close(db);

Once again keep in mind that there are multiple databases to update, so I can't (shouldn't?) just keep them all open, even though I know there are costs to opening and closing the db.

For the readers, I tried using the sample implementation of the backup api. But it turned out to be quite slow for my application. So I read somewhere that it's ok to read the file in, as long as you lock it with a "BEGIN IMMEDIATE". So I did:

sqlite3_open_v2("sqlite3.database.filename", &db, ...);
sqlite3_exec(db, "BEGIN IMMEDIATE", ...);   /* keeps writers out while we copy */
file.open("sqlite3.database.filename");
file.readAll();
file.close();
sqlite3_exec(db, "ROLLBACK", ...);
sqlite3_close(db);

So does this look like a proper way to read/write the database? Would this approach even work multi-threaded (as long as sqlite was compiled/configured properly)? I know it's not the preferred way of doing things, but it actually seems to work quite well in my case. Obviously, I know I can't use a WAL database like this, at least not without reading the other (-wal/-shm) files as well.

I added a busy timeout of several seconds so that if someone else is reading/writing, it won't just fail (the real code also checks for some other errors, etc.). I also run the database with "PRAGMA synchronous = OFF", which increases the speed at least 10x (I'm not too worried about OS crashes/server failure since I'll have backups and it's not critical data).

Does this now look good? Thoughts? Thanks!

Josh


I don't have time right at this minute to carefully read and respond to all the responses to me, but I have to at least at this point say a heartfelt *thank you* to all those who have responded to me. I'm blown away by how detailed and helpful and patient all the responses are. When I have more time in a couple days I'll be reading each response carefully, and I'm sure I'll have a followup question or two.

Josh

This is gonna be a long one, but I've added (hopefully) some humor to make it
a bit easier to read.

@OP
Just so I'm clear, you're pushing SQL statements from clients to your
server, but then pushing raw data from the server to the client?  Myself,
I'd never push a "Work In Progress" database, or a database that has ANY
file handle open against it, to another machine.  If you're cracking open
the raw SQLite file and reading segments of raw data from it and expecting
that data to exist on a client.. oh boy...  You're in for a world of hurt.

If you're set on sending the raw SQLite database, as suggested, I'd use the
backup API for that database, then present the client with the backed-up
copy when the backup is complete.

If, however, pushing 20GB over a traffic-congested, 16 Mbyte/sec *token
ring* network for a one-row update isn't your cup of tea, there are a few
more options.

Work the database so that when a client updates a particular row, a
time stamp recording when the last update was done is placed on that row.  A trigger
would do the trick quite easily.  The client would then make a request from
the database to request all data starting from the date/time of last
request.  Client/Server time differences won't be an issue as you'd be
reading the time from the rows and keeping tabs on the date in that new
update field and not on the client system time.

There are pros and cons to this;
- Con: Deleted records would be a bit tricky.  You wouldn't want to
delete any data, as the clients would then never become aware of the delete, so
this is your first serious impact on database size.  You'd also have to
add another field to each table to indicate the state of the row, if it
should be deleted, or if it should be considered valid, but this will put
another impact on database size.  From there you can provide the client
with proper inserts and proper deletes.  Another option is to put the
literal "delete" statements in another table, with appropriate server time
stamp, and when the client makes a request for an update, return the
results ordered on this time stamp.
- Pro: UNDELETE becomes possible with the first CON added.
- Con: If CASCADE deletes are part of the schema, and you rely on deleting
one row to remove several other rows from other tables, it'll be serious
SERIOUS fun to work through that.  If the first CON is done, I suppose a
trigger could be done to flag the state row and just not allow the delete
to happen.  Then, when you actually want to do the database cleanup, you
temporarily turn off the triggers, do your thing, then turn them back on.
- Con: No guarantee that the returned SQL code would be in order of
execution to make things valid on the client side.  Ordering by the time
stamp, even if the millisecond is considered, if two fast inserts happen
but need to happen in a certain order, things could get messed up for the
client.

The second option is to open a socket connection, not for raw data
transmissions, but for sending SQL code from server to client.  On "First
contact" from a client, two things happen.  First, a socket is opened up to
retrieve the raw database your backup API took care of, but no data
transfers happen until a second socket is opened up to listen for SQL
code.  With the two sockets open, you start downloading the raw data on the
first socket, while caching SQL code on the second.  Once the first socket
is done, you attach a database connection to that data, then start pushing
SQL code to the database.

Again, pros and cons;
- Con: You're still dealing with the *token ring* network on first load,
which is going to cause some grief
- Pro: The client only needs to deal with the second socket.  It only ever
has to listen for commands.
- Con: You're still dealing with the *token ring* network........ .. hell,
that fact doesn't even apply to SQLite.. WHY TOKEN RING MAN!?!?!
- Con: A major revamp of code (Not database) will have to happen on both
the server and client to get the backup and streaming sockets to work and
have the clients aware of what is happening.  You're already using sockets,
so you have the knowledge to get somewhere with this.
- Con: The server is not only going to have to handle the SQL code on
itself, but, it'll also have to hand the code over to 'X' number of clients.
- Con: Establishing which client gets what database might be a bit hard to
do unless you start doing some kind of authentication, which would mean
that the entire database gets put over to the client anyways
- Pro: You'd be using MY IDEA and I wanna see it work when you get it done
like that. ;)
- Con: You'll have to pay for the plane ticket

A third option would be to treat the data similar to how other major SQL
servers handle the data.  You could create an ODBC driver, or build your
own internal protocols that have the client push information to the server,
then the server does the work against that query, then pushes the results
back in a format understood by the client.
- Pro: No "database" on the client side.
- Con: If the client is already expecting a database, a major overhaul is
going to be needed to be done.
- Pro: Depending on the volume of data, the *token ring* network
bottleneck only happens when TWO people make a request, not just one. ;)
- Con: The server code would probably have to be converted into a
multi-threaded application to handle several requests at once, and SQLite
isn't all that great with multi-threading (Not many applications are, and
here we are with multi-core desktop work stations for several years now).
Maybe I shouldn't say "isn't all that great" and gear it towards more like
"You better know what you're doing".

A fourth option would be to take the third suggestion, but remove SQLite
from the picture and just use a major DBMS as your server.  Use SQLite on
the client for faster data access and periodically hit the DBMS for
updates, etc.  ALL major database engines, however, come with a real world
price that'll hit the bank.  MySQL may say "free" on the package, but you
need a license for particular uses.  I don't know the legal talk or the
terms and conditions, but where I work, when we install it on computers we
ship to customers sites, we have to pay for a license for that computer for
some reason.  I also don't know if we pay on a per-application purpose
(Meaning that we can have many different 'databases' on the machine, but
the one application package the servers run is what is being licensed to
all those databases) or if its per CPU or if its per motherboard, or if its
per site, or what.
- Pro: You no longer have to worry about server side code base
- Con: It'll cost you money, use case depending
- Pro: Not re-inventing the wheel.

Please note that I've NEVER done ANY of the above, but I've let those ideas
rattle in my head for a very long time, and have considered impact, code,
and other things, and I'm 1,000% positive I've missed important facets of
each idea that'll hinder or help you along.  I have often wanted to build a
SQLite database server, but with MySQL and the purposes I use it for, why
bother?  Back in the 386 days, I wanted to build my own database engine
before I even knew the "Structured Query Language" even existed, and to be
honest, I don't know if SQL did exist in the 386 days.  But when I found
out how to use SQL, the idea of building my own database went out the door.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
