On 23/02/2009 5:14 PM, Kim Boulton wrote:
> Hello,
>
> Thanks
>
> The grep regex on the text file found around 10,000 lines over 5 minutes
> (out of a total possible 200,000 rows), at which time I stopped it,
> interesting experiment anyway :-)

Uh-huh ... so you'd estimate that it would take 5 minutes * (200K rows / 10K rows) = 100 minutes to get through the lot, correct?

I tried an experiment on a 161 MB CSV file with about 1.1M name-and-address-etc rows in it. Because none of the patterns in your query are likely to match my data, I added an extra pattern that would select about 22% of the records (ended up with 225K output rows), putting it at the end to ensure it got no unfair advantage from a regex engine that tested each pattern sequentially. BTW, I had to use egrep (or grep -E) to get it to work.

Anyway, it took about 6 seconds.

Scaling up by number of input records: 6 * 30M / 1M = 180 seconds = 3 minutes.
Scaling up by file size: 6 * 500 / 161 = about 19 seconds.
By number of output rows: 6 * 200 / 225 ... forget it.
By size of output rows: ... triple forget it.

Conclusion: something went drastically wrong with your experiment. Swapping? Other processes hogging the disk or the CPU? A really duff grep??

Anyway, here's my environment: 2.0 GHz single-core AMD Turion (64-bit but running 32-bit Windows XP SP3), using GNU grep 2.5.3 from the GnuWin32 project; 1 GB memory.

Cheers,
John
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
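
The scaling estimates in the message above can be reproduced with a short script. This is only a sketch of the linear extrapolation: the helper name `scale_time` is made up for illustration, the 30M rows and 500 MB targets are simply the figures used in the arithmetic above, and the egrep input is rounded to 1M rows as in the message.

```python
def scale_time(measured, target, reference):
    """Linearly extrapolate a measured time: measured * target / reference."""
    return measured * target / reference

# Kim's grep run: 5 minutes for 10,000 of a possible 200,000 rows.
kim_minutes = scale_time(5, 200_000, 10_000)

# egrep run: about 6 seconds on a 161 MB file of roughly 1M rows.
by_rows_seconds = scale_time(6, 30_000_000, 1_000_000)
by_size_seconds = scale_time(6, 500, 161)

print(f"Kim's run, extrapolated: {kim_minutes:.0f} minutes")
print(f"egrep, scaled by input rows: {by_rows_seconds:.0f} seconds")
print(f"egrep, scaled by file size: {by_size_seconds:.0f} seconds")
```

Either way of scaling the egrep timing stays far below the 100-minute extrapolation, which is the gap the conclusion points at.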