Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-03 Thread Shane Harrelson
Neither of these are your problem, but I noticed the following in your posted code: PRAGMA auto_vacuum=NONE; has no affect after your tables are created. You should move this setting earlier in your code. http://www.sqlite.org/pragma.html#pragma_auto_vacuum PRAGMA count_changes=OFF is

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-03 Thread Simon Slavin
On 3 Apr 2014, at 3:29am, Kevin Xu wrote: > I have not discovered how to find internal memory throughput usage in OSX, > and I agree that something is not allowing the system from maxing out the CPU > or I/O. A single application can max a single core of a single CPU if

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-03 Thread Dominique Devienne
On Thu, Apr 3, 2014 at 4:10 AM, Kevin Xu wrote: > I have put up my insertion code here: http://pastebin.com/yjRW2mh3 As mentioned by Stephan Beal, use SQLITE_STATIC instead of SQLITE_TRANSIENT for your text/blob binding. As long as the memory buffer you are binding stays

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-03 Thread Clemens Ladisch
Kevin Xu wrote: > The SQLite page_size is on its default setting (1024b), while I have > not dug out the details on my specific SSD, a cursory Google search > reveals SSDs report a 512b sector size for compatibility reasons while > internally implementing a different native sector size (which is >

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-03 Thread dd
Did you see any improvement if application runs with high priority(nice)? (CPU and I/O) On Thu, Apr 3, 2014 at 12:56 AM, Kevin Xu wrote: > The app seems to use between 60-80% CPU while it is running (from Activity > Monitor) while disk use (using sudo iotop -P on OSX) seem

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Kevin Xu
I have not discovered how to find internal memory throughput usage in OSX, and I agree that something is not allowing the system from maxing out the CPU or I/O. Also, the unix time command does reveal that part of the program’s running time is spent in the system, though I do not know yet how

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Kevin Xu
Hi Shane, The SQLite page_size is on its default setting (1024b), while I have not dug out the details on my specific SSD (Apple SSD SD512E, with early 2013 retina MBPs), a cursory Google search reveals SSDs report a 512b sector size for compatibility reasons while internally implementing a

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Keith Medcalf
Keith Medcalf; General Discussion of SQLite Database >Subject: Re: [sqlite] Improving Bulk Insert Speed (C/C++) > >The app seems to use between 60-80% CPU while it is running (from >Activity Monitor) while disk use (using sudo iotop -P on OSX) seem to >suggest about 20%-30% I/O use.

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Shane Harrelson
I had a few suggestions and questions earlier in this thread that I don't think have been responded to. And yes, without seeing the source it will be difficult to make more suggestions. On Apr 2, 2014 4:57 PM, "Kevin Xu" wrote: > The app seems to use between 60-80% CPU while

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Kevin Xu
The app seems to use between 60-80% CPU while it is running (from Activity Monitor) while disk use (using sudo iotop -P on OSX) seem to suggest about 20%-30% I/O use. (spikes to over 30million% every 10s or so, might be when sqlite decides to page out) Clearly my computations on the data are

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Stephan Beal
On Wed, Apr 2, 2014 at 6:57 PM, Dominique Devienne wrote: > runtime, and parsing differently, avoid std::string, might reap more > benefits. > A related tip not yet mentioned: if your strings will outlive their statements' next insertion step(), you can use SQLITE_TRANSIENT

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Eduardo Morras
On Wed, 2 Apr 2014 09:09:58 -0700 Kevin Xu wrote: > > On Apr 2, 2014, at 9:01 AM, Simon Slavin wrote: > > > > > On 2 Apr 2014, at 4:55pm, Kevin Xu wrote: > > > >> PRAGMA journal_mode=MEMORY; > > > > Since you're not using WAL

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Dominique Devienne
On Wed, Apr 2, 2014 at 6:09 PM, Kevin Xu wrote: > Yes, I have tried various combinations of WAL/OFF/MEMORY and they do not > appear to affect the insert speed in any meaningful manner (e.g. +/- 0.3s on > 500K inputs). > > [...] Thus, I am trying to see if there is anything I

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Shane Harrelson
What is your page size? Make sure it matches the sector size of your device. SSDs can be picky about write sizes. As an experiment, you might try using larger page sizes that are multiples of the sector size. Try to reduce the size of the records you are writing. Ie. can you map any string

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Richard Hipp
On Wed, Apr 2, 2014 at 12:02 PM, Kevin Xu wrote: > Richard Hipp wrote: > > > So you are creating a 27.5 GB database in under a half hour? > > > > What is your database page size? > > Yes, to be exact, a 34.24GB database from (34GB+34GB) FASTQ files in that > time. I did not

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Kevin Xu
On Apr 2, 2014, at 9:01 AM, Simon Slavin wrote: > > On 2 Apr 2014, at 4:55pm, Kevin Xu wrote: > >> PRAGMA journal_mode=MEMORY; > > Since you're not using WAL mode, try it using WAL mode. If you were already > using WAL mode, I'd suggest trying

[sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Kevin Xu
Richard Hipp wrote: > So you are creating a 27.5 GB database in under a half hour? > > What is your database page size? Yes, to be exact, a 34.24GB database from (34GB+34GB) FASTQ files in that time. I did not use any pragmas for swiching SQLite page size, so it should be the default (probably

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Simon Slavin
On 2 Apr 2014, at 4:55pm, Kevin Xu wrote: > PRAGMA journal_mode=MEMORY; Since you're not using WAL mode, try it using WAL mode. If you were already using WAL mode, I'd suggest trying with it off ! Given your answers to the questions so far, I can't think of anything else

[sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Kevin Xu
Simon Slavin wrote: > Just wanted to check your phrasing. You process a while file into INSERT > commands, then put > BEGIN before the first one, and END after the last one, thus making the whole > file a single > transaction ? If so, then should make SQLite do things about as quickly as >

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Richard Hipp
On Tue, Apr 1, 2014 at 11:58 PM, Kevin Xu wrote: > Hi everyone I have been working on a small bioinformatics project that > attempts to store FASTQ (http://en.wikipedia.org/wiki/FASTQ_format) data > into a SQLite database. > > The app works by reading in the FASTQ file

Re: [sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Simon Slavin
On 2 Apr 2014, at 4:58am, Kevin Xu wrote: > The app works by reading in the FASTQ file (through Boost::memory_mapped > regions), running the parser through it, binding the variables using > sqlite3_bind_xxx, and inserting them in a prepared statement wrapped within >

[sqlite] Improving Bulk Insert Speed (C/C++)

2014-04-02 Thread Kevin Xu
Hi everyone I have been working on a small bioinformatics project that attempts to store FASTQ (http://en.wikipedia.org/wiki/FASTQ_format) data into a SQLite database. The app works by reading in the FASTQ file (through Boost::memory_mapped regions), running the parser through it, binding the