> So if I can summarize what you're seeing:
> 1. Indexing speed is fine. (No noticeable difference from 3.1.x)
> 2. Search speed soon after re-indexing is slow.
> 3. After a day or so, search speed is lightning fast. (Faster than 3.1.x)
>
> Do I have that about right?

#1 You have that right.  I would say the dig runs at roughly the
same speed, maybe a wee bit slower.  At the very worst, it
_might_ run 10 or 15 percent slower.  Keep in mind that I index
70,000 - 100,000 pages each week (it varies wildly), so I can only
go by 'feel'.  But if it were twice as slow, I'd definitely know
it! :)

Let me tie this into an earlier bug.  As you'll recall, I had
problems with the dig hanging on some domains, which made the dig
a real nightmare for me.  I would spend a lot of time digging
until it hung, deleting the offending domain, and starting over,
on and on until it would run.  That has definitely been fixed.
Now, during the dig, I can actually see these offending domains,
because there is a longer than usual pause before the timeout hits
and things move on.  If that specific code can have a quicker
timeout, any time I might be losing elsewhere would be more than
gained back by the quicker timeouts.
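
To illustrate what I mean by a quicker timeout, here is a minimal sketch in Python. It is not ht://Dig code; the function name `responds_within` and the 5-second default are made up for illustration. It just shows that a short connect timeout puts a ceiling on how long one dead domain can stall things.

```python
import socket

def responds_within(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only; not part of ht://Dig.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connect timeouts, refused connections, and DNS failures.
        return False
```

Note that the timeout here bounds the connection attempt only; a slow DNS lookup can still take longer than that.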

#2 & #3 Yes, after thinking about it, that's what I'm
actually seeing.  Does that make sense?  Is this possible?  As
an observer from the outside, this appears to be the case.  Does
it make sense from "inside" Linux (Red Hat 6.2, btw)?

> Yes, you're right that most systems will cache large parts of the
> databases. Even with 3.1.5, I find that a follow-up search is much faster
> than the original since parts of the DB are cached by the OS. But how big
> is your DB? Why does it take so long for caching to hit?

Here are my files:

-rw-rw-r--   1 root     root      24002560 Jul 30 08:43 db.docdb
-rw-rw-r--   1 root     root       5238784 Jul 30 08:43 db.docs.index
-rw-rw-r--   1 root     root     147042304 Jul 30 08:48 db.excerpts
-rw-rw-r--   1 root     root     298757120 Jul 30 08:46 db.words.db
-rw-rw-r--   1 nobody   nobody     2723840 Jul 30 08:48 db.words.db_weakcmpr
-rw-r--r--   1 root     root       2215936 Jul 30 08:48 root2word.db
-rw-r--r--   1 root     root       2920448 Jul 30 08:48 word2root.db
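
Adding up the sizes in the listing above, as a quick Python sketch. The byte counts are copied from the listing; comparing the total against the 512 MB of RAM this machine has is my own rough framing, and it ignores whatever memory the kernel and other processes are using.

```python
# Byte sizes copied from the directory listing above.
sizes = {
    "db.docdb": 24002560,
    "db.docs.index": 5238784,
    "db.excerpts": 147042304,
    "db.words.db": 298757120,
    "db.words.db_weakcmpr": 2723840,
    "root2word.db": 2215936,
    "word2root.db": 2920448,
}

total = sum(sizes.values())
ram = 512 * 1024 * 1024  # 512 MB of RAM, expressed in bytes

print(f"{total} bytes, {total / 2**20:.1f} MiB, {100 * total / ram:.0f}% of RAM")
# prints: 482900992 bytes, 460.5 MiB, 90% of RAM
```

So the databases together come to roughly 90% of physical memory, which might have something to do with how long it takes before everything that searches touch is cached. That is only a guess.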

I have no explanation for why it takes numerous searches before
enough data gets cached to speed things up.
I have 512 MB of RAM and dual 600s under the hood.
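
One way to test the caching theory would be to read the database files once, start to finish, right after re-indexing, and see whether the first searches are then fast. A minimal sketch, assuming the files live in the current directory; `warm_cache` is a made-up name and I have not tried this on the live index:

```python
def warm_cache(paths, chunk_size=1 << 20):
    """Read each file sequentially so the OS has a chance to cache its pages.

    Returns the total number of bytes read. Hypothetical helper; whether the
    kernel keeps the pages cached depends on how much memory is free.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    return total
```

For example, `warm_cache(["db.words.db", "db.excerpts", "db.docdb"])` would read the three largest files from the listing.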

-Chris
>
>
> ------------------------------------
> To unsubscribe from the htdig3-dev mailing list, send
> a message to
> [EMAIL PROTECTED]
> You will receive a message to confirm this.
>
>



