At 10:38 AM -0700 6/11/00, Ravindra Wankar wrote:
>Phrase match seems very very slow (as compared to "all words" and "any
>words").

Strange. I notice a small slowdown, but not much.

>Also, when running htdig, initially htdig takes up 97-98% of CPU time.
>Memory usage is high but I don't see swapping. After a while the cpu
>usage drops to around 40%. Mem is still fine.

Yes, the word database code still needs some optimization. Profiling 
the code has shown that this is the major bottleneck. If you fiddle 
with the cache size, performance improves, but it's silly to cache 
the whole database. ;-)

>Similarly when htsearch is run I see almost 90-95% CPU usage. What
>happens if there are 10 simultaneous searches?

Right, but you see high CPU usage when you run htsearch in previous 
versions too. Basically, all of the programs are designed to use as 
much CPU as you give them. When I actually finish rewriting the 
htsearch backend, it will be possible to cache search results and 
intermediate results (i.e. part of a query). You *could* do it now, 
but the code would be a total mess.
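
To make the caching idea concrete, here is a minimal Python sketch. 
It is not htsearch code: WORD_DB, LRUCache, lookup_word and search_all 
are hypothetical names, and the dictionary merely stands in for the 
on-disk word database. It caches both per-word document sets 
(intermediate results) and whole "all words" query results.

```python
from collections import OrderedDict

class LRUCache:
    """Small least-recently-used cache (hypothetical helper)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict the oldest entry

# Assumption: an in-memory stand-in for the word database,
# mapping each word to the ids of the documents containing it.
WORD_DB = {"htdig": {1, 2, 3}, "search": {2, 3}, "phrase": {3}}
stats = {"db_lookups": 0}

def lookup_word(word, word_cache):
    """Return the document set for one word, hitting the 'database' only on a miss."""
    hit = word_cache.get(word)
    if hit is not None:
        return hit
    stats["db_lookups"] += 1
    docs = frozenset(WORD_DB.get(word, ()))
    word_cache.put(word, docs)
    return docs

def search_all(words, word_cache, result_cache):
    """'All words' search: intersect the per-word sets, caching the final result."""
    key = tuple(sorted(set(words)))
    hit = result_cache.get(key)
    if hit is not None:
        return hit
    sets = [lookup_word(w, word_cache) for w in key]
    result = frozenset.intersection(*sets) if sets else frozenset()
    result_cache.put(key, result)
    return result
```

With this layout a repeated query is answered from the result cache, 
and a new query that shares a word with an earlier one reuses that 
word's cached document set instead of going back to the database.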

>Would moving to MYSQL DB help? I don't see a patch for 3.2 versions.

Not really. A SQL database might help speed up the document indexes 
slightly, but the word database in SQL would be massive. So you may 
or may not have a performance increase for the word database, but I'm 
very confident you'd have a much bigger database.

>Does anyone know what is/are the bottlenecks? Disk/Mem/CPU? e.g. given
>the above configuration, what can be changed to speed things up?

You will get better disk performance if you use a SCSI disk. Disk 
access is a significant bottleneck however you cut it and will 
probably remain one. The fewer times you need to hit the disk, the 
better.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
