Re: speeding up queries (MySQL faster)

2004-08-27 Thread Yonik Seeley
FYI, this optimization resulted in a fantastic performance boost! I went from 133 queries/sec to 990 queries/sec! I'm now more limited by socket overhead, as I get 1700 queries/sec when I stick the clients right in the same process as the server. Oddly enough, the performance increased, but

Re: speeding up queries (MySQL faster)

2004-08-22 Thread Yonik Seeley
> For example, Nutch automatically translates such clauses into QueryFilters. Thanks for the excellent pointer, Doug! I will definitely be implementing this optimization. If anyone cares, I did a 1-minute hprof test with the search server in a servlet container. Here are the results (sorry

Re: speeding up queries (MySQL faster)

2004-08-22 Thread Doug Cutting
Yonik Seeley wrote: Setup info & Stats: - 4.3M documents, 12 keyword fields per document, 11 [ ... ] "field1:4 AND field2:188453 AND field3:1" field1:4 done alone selects around 4.2M records field2:188453 done alone selects around 1.6M records field3:1 done alone selects around 1K record
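
The idea discussed in this thread, turning a clause that matches most of the index into a cached filter, can be sketched with plain java.util.BitSet. This is only an illustration under stated assumptions, not Lucene's or Nutch's actual code: the class ClauseFilterCache, its methods, and the tiny in-memory arrays standing in for the index are all hypothetical. In Lucene 1.4 the corresponding pieces would be QueryFilter and IndexSearcher.search(Query, Filter).

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: cache one bit set per broad clause and reuse it. */
class ClauseFilterCache {
    private final Map<String, BitSet> cache = new HashMap<>();
    private int computations = 0;

    /**
     * Returns the bit set of documents whose field value equals wanted.
     * The int[] is a stand-in for a real index scan (an assumption);
     * the scan runs once per clause key and is cached afterwards.
     */
    BitSet bitsFor(String clauseKey, int[] fieldValues, int wanted) {
        BitSet bits = cache.get(clauseKey);
        if (bits == null) {
            bits = new BitSet(fieldValues.length);
            for (int doc = 0; doc < fieldValues.length; doc++) {
                if (fieldValues[doc] == wanted) {
                    bits.set(doc);
                }
            }
            cache.put(clauseKey, bits);
            computations++;
        }
        return bits;
    }

    /** Number of times a clause had to be evaluated from scratch. */
    int computations() {
        return computations;
    }

    /**
     * Walks only the short posting list of the selective clause and
     * keeps the documents that the cached filter also allows.
     */
    static BitSet search(int[] selectivePostings, BitSet filter) {
        BitSet hits = new BitSet();
        for (int doc : selectivePostings) {
            if (filter.get(doc)) {
                hits.set(doc);
            }
        }
        return hits;
    }
}
```

With this shape, the cost of a query is driven by the most selective clause (about 1K records for field3:1 above) rather than by the broad ones (millions of records for field1:4), because each broad clause becomes a single bit lookup per candidate document.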

Re: speeding up queries (MySQL faster)

2004-08-22 Thread Yonik Seeley
Oops, CPU usage is *not* 50%, but closer to 98%. This is due to a bug in CPU% on RHEL 3 on multiprocessor CPUs (I can run multiple threads in while(1) loops, and it will still only show 50% CPU usage for that process). The aggregated (not per-process) statistics shown by top are correct, and they s

Re: speeding up queries (MySQL faster)

2004-08-21 Thread Bernhard Messer
Yonik, there is another "synchronized" block in CSInputStream which could block your second CPU out. Do you think there is a chance to recreate the index (maybe a smaller subset) without the compound file option enabled and run your test again, so that we can see if this helps? regards Bernhard Ot

Re: speeding up queries (MySQL faster)

2004-08-21 Thread Otis Gospodnetic
Ah, you may be right (no stack trace in email any more). Somebody recently identified a few bottlenecks that, if I recall correctly, were related to synchronized blocks. I believe Doug committed some improvements, but I can't remember which version of Lucene that is in. It's definitely in 1.4.1.

Re: speeding up queries (MySQL faster)

2004-08-20 Thread Yonik Seeley
--- Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > The bottleneck seems to be disk IO. But it's not. Linux is caching the whole file, and there really isn't any disk activity at all. Most of the threads are blocked on InputStream.refill, not waiting for the disk, but waiting for their turn into

Re: speeding up queries (MySQL faster)

2004-08-20 Thread Otis Gospodnetic
The bottleneck seems to be disk IO. Since this is a read-only index, why not spread some of the frequently scanned index files over multiple disks, or put the index on SCSI disks hooked up in a RAID? Maybe this is already the case, but you didn't mention it. Oh, I already answered a similar quest

speeding up queries (MySQL faster)

2004-08-20 Thread Yonik Seeley
Hi, I'm trying to figure out how to speed up queries to a large index. I'm currently getting 133 req/sec, which isn't bad, but isn't too close to MySQL, which is getting 500 req/sec on the same hardware with the same set of documents. Setup info & Stats: - 4.3M documents, 12 keyword fields per do