I haven't really looked too much at the search code or done much work
with Lucene, but I would agree that it doesn't seem to make much sense
to try to update the index immediately every time someone posts a new
entry. I think a scheduled task that simply updates the entire index
periodically, maybe every 30-60 minutes, would be fine.
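For what it's worth, a scheduled rebuild along those lines could be sketched with JDK 5's java.util.concurrent (the class and method names here are made up, and the rebuild Runnable is a placeholder for re-running Roller's indexing over all entries):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Minimal sketch of a periodic full-index rebuild. The rebuild task
// itself is a placeholder; the point is that only this one scheduler
// thread ever touches the index, on a fixed delay.
public class IndexRebuildScheduler {

    // Runs `rebuild` repeatedly, waiting `period` (in `unit`) between
    // the end of one run and the start of the next.
    public static ScheduledExecutorService start(Runnable rebuild,
                                                 long period, TimeUnit unit) {
        ScheduledExecutorService ses =
            Executors.newSingleThreadScheduledExecutor();
        ses.scheduleWithFixedDelay(rebuild, period, period, unit);
        return ses;
    }
}
```

At startup you'd call something like start(rebuildTask, 30, TimeUnit.MINUTES) and shut the executor down when the webapp is destroyed.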
-- Allen
Rudman Max wrote:
When you say "internal search engine" do you mean Lucene? I presume
by "external" you mean something like Google? At any rate, I think
the problem is either in Lucene or in that "edu.oswego" library you
are using to schedule index writes. I think what's happening is that
at a sustained high rate of posts, the threads wanting to write to the
Lucene index pile up and cause the problem. Sadly, I don't think the
concurrency level needed to expose this bug is very high at all. I've
been able to reproduce it with as few as 5 concurrent users making
posts.
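One way to keep those writer threads from piling up, sketched here with JDK 5's java.util.concurrent rather than the edu.oswego package (all names are hypothetical, and indexEntry stands in for the real Lucene write): funnel every write through a single-threaded executor, so request threads just drop off a task instead of contending for the index.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: at most one thread ever writes to the index; the rest
// return immediately after queueing a small task.
public class SerializedIndexWrites {
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final AtomicInteger written = new AtomicInteger();

    // Called from request threads; cheap and non-blocking.
    public void submitWrite(final String entryId) {
        writer.submit(new Runnable() {
            public void run() {
                // Placeholder for the real Lucene write for entryId.
                written.incrementAndGet();
            }
        });
    }

    // Waits for queued writes to finish; returns how many were done.
    public int drain() throws InterruptedException {
        writer.shutdown();
        writer.awaitTermination(5, TimeUnit.SECONDS);
        return written.get();
    }
}
```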
My fear is that this is something fundamental to the way Lucene
manages its indexes, and I suspect it's not an easy fix. I mean, this
is the kind of problem that database vendors have to solve with
pretty sophisticated solutions. How receptive would you (and the
community) be to a proposal for changing real-time writes to some
sort of batched mode, where a periodically run process is responsible
for indexing any un-indexed entries?
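Concretely, the batched mode could be as simple as this sketch (the class and method names are made up, and indexEntry is a placeholder for the real Lucene write): posts only enqueue entry ids, and the periodic job drains the queue in one pass, so no request thread ever touches the index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed batched mode.
public class BatchIndexer {
    private final List<String> pending = new ArrayList<String>();

    // Called from the request thread when an entry is saved; cheap
    // apart from this short synchronized block.
    public synchronized void queue(String entryId) {
        pending.add(entryId);
    }

    // Called periodically by a scheduled task; indexes everything that
    // accumulated since the last run and returns how many were indexed.
    public synchronized int flush() {
        int count = pending.size();
        for (String id : pending) {
            indexEntry(id);  // would reuse one index writer for the batch
        }
        pending.clear();
        return count;
    }

    private void indexEntry(String entryId) {
        // Placeholder for the real Lucene write.
    }
}
```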
Max
On Jun 28, 2005, at 10:52 PM, Allen Gilliland wrote:
I'm not sure if we've ever isolated things down to exactly that
problem, but for blogs.sun.com we've definitely had a number of
problems with the built-in search engine. I believe a number of those
problems have been fixed, so if you aren't already using the latest
CVS, you could run your tests against the upcoming 1.2 release and
see what happens. Unfortunately we won't be the best help with search
problems, because we use an external search engine and currently have
Roller's built-in search disabled.
-- Allen
Rudman Max wrote:
We've been testing Roller under some pretty high load (about 500
concurrent users) running search transactions (which read from the
index) and post transactions (which write to it). After a while, we'd
run out of file handles, which froze Tomcat because it could no
longer open sockets to accept incoming connections. Our sysadmin told
me there were a bunch of orphaned file descriptors on files in the
/roller-index directory. I am not sure if that's the reason the
Tomcat process ran out of files, but it seems likely. Has anybody
ever experienced this problem with the Roller search index?
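Not Roller-specific, but one common source of orphaned descriptors on index files is a per-request reader or searcher that is opened and then never closed when the search throws. This toy sketch (the Handle class is made up) just counts "open handles" to show why close() has to sit in a finally block:

```java
// Toy model of the leak: every search opens a handle, and without the
// finally block every failed request would leak one descriptor.
public class HandleLeakDemo {
    static int openHandles = 0;

    static class Handle {
        Handle() { openHandles++; }
        void close() { openHandles--; }
    }

    // Closes the handle even when the search itself throws.
    static void searchSafely(boolean fail) {
        Handle h = new Handle();
        try {
            if (fail) throw new RuntimeException("search blew up");
        } catch (RuntimeException e) {
            // swallow for the demo; a real handler would log it
        } finally {
            h.close();  // runs on both the normal and the failure path
        }
    }
}
```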
Max