On 08/06/2016 09:52 AM, Kevin O'Gorman wrote:
On Fri, Aug 5, 2016 at 2:03 PM, Dan Kennedy <danielk1...@gmail.com> wrote:

On 08/06/2016 03:28 AM, Kevin O'Gorman wrote:

On Fri, Aug 5, 2016 at 1:08 PM, David Raymond <david.raym...@tomtom.com>
wrote:

......

Apart from the default location of the files, it reads like your next main concern is
how many temp files get opened up. My bet is that it'll be a very small number, just
potentially huge in file size while it's doing its thing. But again, try that pragma
and take a look.
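
The pragma being referred to is presumably temp_store or temp_store_directory; here is
a minimal Python sketch of steering SQLite's temp files somewhere with room to spare
(the paths, database, and index names are placeholders, not taken from this thread --
SQLITE_TMPDIR is the environment-variable route to the same end):

    import os
    import sqlite3

    # SQLITE_TMPDIR must be set before SQLite creates its temp files.
    os.environ["SQLITE_TMPDIR"] = "/big/scratch"        # placeholder scratch directory

    conn = sqlite3.connect("example.db")                # placeholder database
    conn.execute("PRAGMA temp_store = FILE")            # keep temp data on disk, not in RAM
    # Deprecated but still honored by many builds:
    conn.execute("PRAGMA temp_store_directory = '/big/scratch'")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_board ON positions(board)")  # placeholder index
    conn.close()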

My best bet is the contrary: it starts with many small files and merges them into
progressively larger ones, the way the sort utility does.  The problem is that there
are too many of them at the beginning for it to work with anonymous files (which sort
does not use).  That at least offers a possible explanation of its getting wedged on
large indexes: an unexpected and untested error, handled poorly.
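
That is the classic external merge-sort pattern: many small sorted runs written to temp
files, then merged into larger output.  A minimal Python sketch of the general idea, as
an illustration only (single merge pass, not SQLite's actual sorter):

    import heapq
    import tempfile

    def _dump_run(sorted_lines):
        # One sorted run per anonymous-style temp file (deleted when closed).
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(sorted_lines)
        f.seek(0)
        return f

    def external_sort(lines, out, run_size=100_000):
        # Phase 1: many small sorted runs on disk.
        runs, buf = [], []
        for line in lines:
            buf.append(line)
            if len(buf) >= run_size:
                runs.append(_dump_run(sorted(buf)))
                buf = []
        if buf:
            runs.append(_dump_run(sorted(buf)))
        # Phase 2: merge the runs into one large sorted output.
        out.writelines(heapq.merge(*runs))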

You could verify this by checking the number of open handles in
"/proc/<pid>/fd" after your process is wedged.

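Something along these lines will list them; a minimal sketch, assuming Linux /proc
semantics (the pid is whatever ps reports for the wedged process; unlinked temp files
show up with a "(deleted)" suffix):

    import os
    import sys

    pid = sys.argv[1]                       # pid of the wedged process
    fd_dir = f"/proc/{pid}/fd"
    fds = sorted(os.listdir(fd_dir), key=int)
    for fd in fds:
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            target = "?"
        print(fd, "->", target)             # anonymous temp files appear as '... (deleted)'
    print("open descriptors:", len(fds))
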
Excellent idea.  I did not know about that possibility.  And sure enough,
I'm wrong.  It's using anonymous files, all right, but only one or two at a
time.  I assume they're big.  I'm in the process of bracketing where size
begins to matter.  So far, 1/10 of the data loads and indexes just fine,
albeit somewhat more slowly than the smaller samples predicted.  The
database load took 6.5 minutes, the troublesome index 10 minutes.  At
smaller sizes, indexing is faster than the database load.

I'm trying 1/3 now (500 million lines).

What does [top] tell you once the process becomes wedged? What percentage is the CPU running at? Or is it completely bogged down waiting for IO?
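
A minimal sketch of the same check without babysitting [top], assuming Linux /proc
semantics: state 'R' means the process is burning CPU, 'D' means it is stuck in
uninterruptible disk wait, and /proc/<pid>/io shows whether bytes are still moving:

    import sys
    import time

    pid = sys.argv[1]                       # pid of the wedged process

    def state():
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("State:"):
                    return line.split()[1]

    def io_bytes():
        counts = {}
        with open(f"/proc/{pid}/io") as f:  # readable for your own processes (or as root)
            for line in f:
                key, val = line.split(":")
                counts[key.strip()] = int(val)
        return counts.get("read_bytes", 0), counts.get("write_bytes", 0)

    prev = io_bytes()
    while True:                             # Ctrl-C to stop sampling
        time.sleep(5)
        cur = io_bytes()
        print(f"state={state()}  +{cur[0]-prev[0]} B read  +{cur[1]-prev[1]} B written (last 5s)")
        prev = cur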

Dan.
