On Fri, Aug 5, 2016 at 2:03 PM, Dan Kennedy <danielk1...@gmail.com> wrote:
> On 08/06/2016 03:28 AM, Kevin O'Gorman wrote:
>> On Fri, Aug 5, 2016 at 1:08 PM, David Raymond <david.raym...@tomtom.com>
>> wrote:
>>
>> ......
>>
>>> Apart from the default location of the files, it reads like your next
>>> main concern is how many temp files get opened up. My bet is that it'll
>>> be a very small number, just potentially huge in file size while it's
>>> doing its thing. But again, try that pragma and take a look.
>>>
>> My best bet is the contrary: it starts with small files and makes
>> increasingly larger ones, like the sort utility does. The problem is that
>> there are too many of them at the beginning for it to work with anonymous
>> files (which sort does not use). This at least offers a possible
>> explanation of its getting wedged on large indexes: an unexpected and
>> untested error, handled poorly.
>>
> You could verify this by checking the number of open handles in
> "/proc/<pid>/fd" after your process is wedged.
>

Excellent idea. I did not know about that possibility. And sure enough, I'm
wrong. It's using anonymous files, all right, but only one or two at a time.
I assume they're big.

I'm in the process of bracketing where size begins to matter. So far, 1/10
of the data loads and indexes just fine, albeit somewhat more slowly than
the smaller samples predicted. The database load took 6.5 minutes, the
troublesome index 10 minutes. At smaller sizes, indexing is faster than the
database load. I'm trying 1/3 now (500 million lines).

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
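
[Editor's note: the "/proc/<pid>/fd" check mentioned above can be done with a
plain "ls -l /proc/<pid>/fd" from a shell. For readers who want the same thing
programmatically, here is a minimal C sketch; the program name (fdcount) and
its command-line interface are illustrative, not something from the thread.
Descriptors pointing at unlinked ("anonymous") temp files show a target
ending in "(deleted)" on Linux.]

/* fdcount.c - list and count the open file descriptors of a process
 * by reading the /proc/<pid>/fd directory.
 * Build:  cc -o fdcount fdcount.c
 * Usage:  ./fdcount <pid>
 */
#include <stdio.h>
#include <dirent.h>
#include <unistd.h>
#include <limits.h>

int main(int argc, char **argv)
{
    char dirpath[64];
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    snprintf(dirpath, sizeof(dirpath), "/proc/%s/fd", argv[1]);

    DIR *d = opendir(dirpath);
    if (!d) {
        perror(dirpath);
        return 1;
    }

    int count = 0;
    struct dirent *de;
    while ((de = readdir(d)) != NULL) {
        char linkpath[PATH_MAX];
        char target[PATH_MAX];
        ssize_t len;

        if (de->d_name[0] == '.') continue;   /* skip "." and ".." */

        /* Each entry is a symlink named after the fd number; readlink
         * shows what it refers to (a file, socket, pipe, etc.). */
        snprintf(linkpath, sizeof(linkpath), "%s/%s", dirpath, de->d_name);
        len = readlink(linkpath, target, sizeof(target) - 1);
        if (len < 0) len = 0;
        target[len] = '\0';

        printf("fd %s -> %s\n", de->d_name, target);
        count++;
    }
    closedir(d);

    printf("%d descriptors open\n", count);
    return 0;
}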