Yeah, I rolled a release with Lucene 1.4 two weeks ago, I think,
and I was able to spider. However, I ran across strange
errors, and at one point I did something that kept me
from ever querying the data. I noticed that under 1.4
the analyze db seemed to run much faster. Not sure why
that was.

I do appreciate your help! I'm getting all my notes
together on the configuration and setup, so I'll be
happy to help out with documentation, and as I get more
into the code I'll be helping out there.

Thanks again!
-byron

--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> Byron Miller wrote:
> > Is there any recommended heap size or jvm properties
> > that someone has come up with for optimal performance?
> 
> Searching shouldn't require a huge Java heap.  The majority of the RAM
> should be either left to the OS to use as a filesystem cache for index
> files, or, perhaps, as a RAM FS.
> 
> > We are looking at bumping up our servers to 16 gigs of
> > memory apiece for our core systems, as our cost is
> > facilities and management, and with the Opterons being
> > able to use tons of memory efficiently it's the best
> > value for us.
> 
> If you can afford to have a 16GB machine for every 8M documents, then
> you may have room to place the index in a RAM FS, which makes search
> quite fast.  If you can't quite fit all of the index files in the RAM
> FS, the most important are the .tis, .frq, and .prx files, in that
> order.  In some experiments that Ben did, the kernel's filesystem
> cache eventually performed nearly as well as using a RAM FS, but it
> took a while for the cache to get warm.
> 
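The file priority described above can be sketched in a few lines. This is only an illustration, not anything from Nutch or Lucene: the function name `pick_files_for_ramfs`, the byte budget, and the file names and sizes in the usage note are all hypothetical. It orders files by the .tis, .frq, .prx priority and keeps adding files while they still fit in the budget.

```python
import os

# Priority order taken from the message above: .tis first, then .frq,
# then .prx. Any other extension is treated as lowest priority.
PRIORITY = [".tis", ".frq", ".prx"]


def pick_files_for_ramfs(files, budget_bytes):
    """Return the names of the files to place in the RAM FS.

    files: iterable of (name, size_in_bytes) pairs (hypothetical input).
    budget_bytes: assumed capacity of the RAM FS, in bytes.
    """
    def rank(item):
        ext = os.path.splitext(item[0])[1]
        # Extensions not named in PRIORITY sort after the three named ones.
        return PRIORITY.index(ext) if ext in PRIORITY else len(PRIORITY)

    chosen, used = [], 0
    # sorted() is stable, so files of equal priority keep their input order.
    for name, size in sorted(files, key=rank):
        # Skip a file that does not fit, but keep trying later ones.
        if used + size <= budget_bytes:
            chosen.append(name)
            used += size
    return chosen
```

For example, calling `pick_files_for_ramfs([("_a.prx", 400), ("_a.frq", 300), ("_a.tis", 100)], 450)` with these made-up sizes would select the .tis and .frq files and leave the .prx file on disk. Skipping a file that does not fit and continuing with smaller ones is a design choice of this sketch; stopping at the first file that does not fit would be an equally reasonable reading of the advice.
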
> Nutch currently uses Lucene 1.3.  There are optimizations in the Lucene
> 1.4 codebase which should make most Nutch searches significantly faster.
> However, there are bugs in the Lucene 1.4RC2 release that will affect
> Nutch.  So, if you want to try Lucene 1.4, to see if it helps
> performance, use the latest CVS, which has fixes for the known bugs
> relevant to Nutch.  (I intend to make a Lucene 1.4RC3 release ASAP, so
> you could also just wait and try that.)
> 
> Cheers,
> 
> Doug



_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
