Byron said:

> Since you're running Debian, can you confirm your
> java_home points to 1.4.2 and not Kaffe for both Nutch
> & Tomcat?

Yes, I'm sure of this (neither locate nor which finds kaffe,
and the environment variable looks correct).
 
> If you have corruption, you may want to start over. 
> My laptop runs quicker queries on 300k pages than this
> server yields results.

Wow. FWIW, I first tried bin/nutch segread -fix, but that didn't
fix the corrupted segment (there has been another report of that
here), even after I deleted all the index files, as was also
reported here. I then tried bin/nutch segslice -fix and that did
work: it created two segments, one of which had zero size while
the other was fine. (Oh, and I found out that the segment was
corrupted with bin/nutch segread -list.)

But even with bin/nutch segslice -fix, it seems that I would
at least need to refetch. Is this correct?

> Was your crawl/fetch performing terribly as well or
> just queries?

Hm. I'm not sure how fast the crawl/fetch should be; it took about
24 hours for 300,000 pages (doubtless this depends on the files in
conf; I can live with that speed). Queries, though, take
10-15 seconds.
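
To put that fetch time in perspective, here is a rough calculation of the
average rate implied by about 300,000 fetches in about 24 hours. It is only
a sketch: both figures are approximate, and the function name is made up
for illustration.

```python
# Back-of-the-envelope fetch throughput from the approximate figures above.
# pages_per_second is a hypothetical helper, not part of Nutch.

def pages_per_second(pages: int, hours: float) -> float:
    """Average fetch rate over the whole crawl."""
    return pages / (hours * 3600)

rate = pages_per_second(300_000, 24)
print(f"{rate:.2f} pages/sec")           # average fetch rate
print(f"{1000 / rate:.0f} ms per page")  # average wall-clock time per page
```

That works out to roughly 3.5 pages per second averaged over the whole
crawl.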

    - Bill

> 
> -byron
> 
> --- Bill Goffe <[EMAIL PROTECTED]> wrote:
> 
> > Hello -
> > 
> > I'm experiencing slow searches. Here are the specifics:
> >   - Search example: http://rfe.org/search.jsp?query=wealth+of+nations
> >     reliably takes 11 seconds
> >   - ~300K pages in the database (used mergesegs w/ indexing on my three
> >     segments; one was found partially corrupted)
> >   - Dual 2.80GHz Xeon machine with 3 gig RAM and SCSI disks (hardware RAID?)
> >   - Nutch 0.7.1
> >   - JAVA_OPTS="-Xmx1024m -Xms512m" (doesn't seem to matter)
> >   - Tomcat 5.5.9 (minProcessors="5" maxProcessors="75" in my connector
> >     for proxying in server.xml)
> >   - Java(TM) 2 SDK, Standard Edition Version 1.4.2
> >   - Linux (Debian) with 2.4.27-2-686-smp kernel
> > 
> > When I monitor the search with htop (a _nice_ replacement for top -- much
> > easier to kill or renice jobs in it than top, and can easily view parent
> > and child processes and sort views different ways) I see 41 processes
> > (seems like a lot?) started by Tomcat. Memory usage for each goes to ~200M
> > after a search of the above from about 64K at Tomcat startup (even on a
> > single word search it goes to ~150M).
> > 
> > I didn't see anything obvious in nutch-default.xml to fiddle with nor
> > anything that really seemed apropos in the list archive (other than others
> > seem to get much faster searches). Any suggestions?
> > 
> >          - Bill
> > 

-- 
         *------------------------------------------------------*
         | Bill Goffe                 [EMAIL PROTECTED]          |
         | Department of Economics    voice: (315) 312-3444     |
         | SUNY Oswego                fax:   (315) 312-5444     |
         | 416 Mahar Hall             <http://cook.rfe.org>     |          
         | Oswego, NY  13126                                    |
*--------*------------------------------------------------------*-----------*
| "Students without a bedroom television scored an average of about 63 on   |
| the mathematics section of the test, while students with a bedroom TV     |
| scored an average of about 53 (P<0.001)."                                 |
|  -- A study on the impact of TVs in the bedrooms of third graders.        |
|     "Bedroom TV Associated With Lower Achievement Scores," Jeff Minerd,   |
|     <http://www.medpagetoday.com/Pediatrics/Parenting/tb/1303>.           |
*---------------------------------------------------------------------------*



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
