On 23/02/2009 5:14 PM, Kim Boulton wrote:
> Hello,
> 
> Thanks
> 
> The grep regex on the text file found around 10,000 lines over 5 minutes 
> (out of a total possible 200,000 rows), at which time I stopped it, 
> interesting experiment anyway :-)

Uh-huh ... so you'd estimate that it would take 5 minutes * (200K rows / 
10K rows) = 100 minutes to get through the lot, correct?

I tried an experiment on a 161 MB CSV file with about 1.1M 
name-and-address-etc rows in it. Because none of the patterns in your 
query are likely to match my data, I added an extra pattern that would 
select about 22% of the records (ended up with 225K output rows), 
putting it at the end to ensure it got no unfair advantage from a regex 
engine that tested each pattern sequentially.
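
To illustrate what "putting it at the end" means, here is a minimal Python sketch. It is not the actual grep command or data from my test: the patterns, the extra pattern and the sample rows below are all made-up placeholders, and Python's re module merely stands in for a regex engine that tries alternatives from left to right.

```python
import re

# Hypothetical stand-ins for the patterns in the original query
# (the real patterns are not reproduced here).
query_patterns = [r"^abc.*xyz$", r"^foo\d+bar$"]

# Hypothetical extra pattern chosen so that some sample rows match.
extra_pattern = r"Smith"

# The extra pattern goes last: an engine that tries alternatives in
# order only reaches it after every earlier alternative has failed.
combined = re.compile("|".join(query_patterns + [extra_pattern]))

# Made-up name-and-address rows, for illustration only.
sample_rows = [
    "1,John Smith,12 High St",
    "2,Mary Jones,3 Low Rd",
    "3,Ann Smith,9 Mid Ave",
]

matches = [row for row in sample_rows if combined.search(row)]
print(len(matches), "of", len(sample_rows), "rows matched")
```

With the extra pattern first instead of last, a sequential engine would find the matching rows sooner, which is the unfair advantage I wanted to avoid.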

BTW, I had to use egrep (or grep -E) to get it to work.

Anyway, it took about 6 seconds. Scaling up by number of input records: 
6 * 30M / 1M = 180 seconds = 3 minutes. Scaling up by file size: 6 * 500 
/ 161 = 19 seconds. By number of output rows: 6 * 200 / 225 ... forget 
it. By size of output rows: ... triple forget it.
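
For anyone who wants to check the back-of-envelope figures, the same arithmetic can be written as a short Python sketch. It assumes run time scales linearly with input size, which is only a rough guess, and the function name scale is just illustrative.

```python
def scale(measured, target_size, measured_size):
    """Linearly extrapolate a measured run time to a different input size."""
    return measured * target_size / measured_size

# My run: about 6 seconds for roughly 1M rows in a 161 MB file.
by_rows = scale(6, 30_000_000, 1_000_000)  # seconds, scaled to 30M rows
by_size = scale(6, 500, 161)               # seconds, scaled to 500 MB

# The quoted grep run: 5 minutes for 10,000 of a possible 200,000 lines.
quoted_total_minutes = scale(5, 200_000, 10_000)

print(f"by rows: {by_rows:.0f} s ({by_rows / 60:.0f} min)")
print(f"by size: {by_size:.0f} s")
print(f"quoted run, extrapolated: {quoted_total_minutes:.0f} min")
```

Whichever scaling you pick, the estimate lands between a few seconds and a few minutes, nowhere near the 100 minutes extrapolated from the quoted run.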

Conclusion: something went drastically wrong with your experiment. 
Swapping? Other processes hogging the disk or the CPU? A really duff grep??

Anyway, here's my environment: 2.0 GHz single-core AMD Turion (64 bit 
but running 32-bit Windows XP SP3), using GNU grep 2.5.3 from the 
GnuWin32 project; 1 GB memory.

Cheers,
John
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
