Hi all,
I need to insert a lot of data (a million+ rows) very quickly and it must occur
on multiple threads. I do not have control over the creation and management of
the threads, because I am using sqlite within the context of a plugin: I
provide callback functions and the controlling process provides my functions
with thread id's and other relevant data.
I decided (perhaps naively) that the safest bet to avoid thread lock would be
to have each thread insert into its own database (with its own db handle)
during the heavy multi-threaded writing phase, then merge all the data into a
single database file at the end. I am using the special ":memory:" database as
that seemed ideal for my purposes: fast, private, and temporary. Everything
works, but I am getting a lot of thread locking, so much so that running on 8
threads it is about 5 times slower than when it runs on a single thread.
So my question is: if each thread is writing to its own database with its own
connection, why are the threads locking?
I've profiled it with Shark and the problem is definitely the mutex lock. Here
is the callstack:
sqlite3_step
sqlite3VdbeExec
sqlite3BtreeInsert
sqlite3BtreeMovetoUnpacked
getAndInitPage
sqlite3PcacheFetch
pcache3Fetch
pthreadMutexEnter
pthread_mutex_lock
All of the inserts for a thread are wrapped in a single BEGIN.. END.
I'm setting:
PRAGMA synchronous=OFF
PRAGMA journal_mode=OFF
One peculiar thing is that I see no difference in behavior, stability, or
performance when changing the sqlite3_open_v2() flags to include
SQLITE_OPEN_NOMUTEX or SQLITE_OPEN_FULLMUTEX.
The next step is to compile debug versions of everything and step through to
find out what is causing the lock, but before I go down that road, I'd like to
know if there isn't something else I should try first.
I'm using the sqlite that comes with osx 10.6, version 3.6.12.
thanks,
chad
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users