I had a few suggestions and questions earlier in this thread that I don't think have been responded to.
And yes, without seeing the source it will be difficult to make more suggestions. On Apr 2, 2014 4:57 PM, "Kevin Xu" <accol...@gmail.com> wrote: > The app seems to use between 60-80% CPU while it is running (from Activity > Monitor) while disk use (using sudo iotop -P on OSX) seem to suggest about > 20%-30% I/O use. (spikes to over 30million% every 10s or so, might be when > sqlite decides to page out) > > Clearly my computations on the data are not complex enough to max out the > processor (the bulk inserts, at least, is single threaded) while iotop > results suggests I/O isn't maxed out either, which is why I suspect > performance improvements are still possible. > > On the other hand, it is difficult for anyone to suggest more ideas > without seeing the actual source code (professor wants it private for now), > so I will try coding up the virtual table, and pure binary file dump (as a > baseline) to see if I discover any more issues. > > Kevin Xu > > > On Apr 2, 2014, at 1:28 PM, Keith Medcalf <kmedc...@dessus.com> wrote: > > > On Tue, 1 Apr 2014 20:58:14 -0700 > > Kevin Xu <accol...@gmail.com> wrote: > > > >> I am trying to see if it is possible to achieve even higher > >> throughput - my professor was adamant that he managed > >> to insert 3 million rows in 7 seconds (which works out to > >> over 420K inserts per second) though he could not find > >> his code that did it or records of his insertions. > > > >> When I profiled the application (using Instruments), and after > >> inverting the call tree, the time spent within the program is > >> broken down as follows: > > > >> 2170ms 15.1% - sqlite3VdbeExec -> sqlite3_step -> insert function > >> 2142ms 14.9% - pwrite -> unixWrite -> pager_write/pager_write_pagelist > >> 1925ms 14.9% - std::string::insert -> boost::spirit > >> 539ms 3.7% - pack (my compression function) > > > > A question that nobody seems to have bothered to ask, and that will > > entirely direct your solution search: > > > > Is the limit I/O or CPU? > > > > That is, which has hit 100% usage, CPU or I/O? If is is I/O then you > > might want to look for consumers of I/O. The best way to make I/O go > > faster is not to do it. > > > > If the limit is CPU, then you need to devise a way to consume less CPU > > (or get a faster CPU) or to get more processors and parallelize your > > processing. > > > > Secondly, what is the headroom you have available? For example if you > > are using 100% CPU and 97% I/O, then you are damn close to balanced and > > the additional couple of K per second you can gain in I/O is unlikely > > to make any significant difference, and will likely end up being the > > bottleneck even if you spend a couple of million dollars on CPU and > > MEMORY (to drive CPU usage down to 0.0001% but peg I/O to 100%). > > > > If the problem is CPU, then the first place to attack is boost:spirit > > (or use multiprocessing) -- after checking to ensure that you I/O > > system is properly offloaded from the main CPU and not consuming spin > > cycles while doing I/O, of course. > > > > If neither I/O nor CPU is pegged at 100% then you have a crappy > > scheduler or simply insufficient overlap between I/O and compute, and > > you need to fix this first. > > _______________________________________________ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users