Yeah, I rolled a release with Lucene 1.4 about two weeks ago, I think, and I was able to spider. However, I ran across strange errors, and at one point I did something that kept me from ever querying the data. I did notice that under 1.4 the analyze db step seemed much faster; not sure what caused that.
I do appreciate your help! I'm getting all my notes together on the configuration and setup, so I'll be happy to help out with documentation, and as I get more into the code I'll be helping out there too.

Thanks again!
-byron

--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> Byron Miller wrote:
> > Is there any recommended heap size or JVM properties
> > that someone has come up with for optimal performance?
>
> Searching shouldn't require a huge Java heap. The majority of the RAM
> should be either left to the OS to use as a filesystem cache for index
> files, or, perhaps, as a RAM FS.
>
> > We are looking at bumping up our servers to 16 gigs of
> > memory apiece for our core systems, as our cost is
> > facilities and management, and with the Opterons being
> > able to use tons of memory efficiently it's the best
> > value for us.
>
> If you can afford to have a 16 GB machine for every 8M documents, then
> you may have room to place the index in a RAM FS, which makes search
> quite fast. If you can't quite fit all of the index files in the RAM
> FS, the most important are the .tis, .frq, .prx, in that order. In some
> experiments that Ben did, the kernel's filesystem cache eventually
> performed nearly as well as using a RAM FS, but it took a while for
> the cache to get warm.
>
> Nutch currently uses Lucene 1.3. There are optimizations in the Lucene
> 1.4 codebase which should make most Nutch searches significantly faster.
> However, there are bugs in the Lucene 1.4RC2 release that will affect
> Nutch. So, if you want to try Lucene 1.4, to see if it helps
> performance, use the latest CVS, which has fixes for the known bugs
> relevant to Nutch. (I intend to make a Lucene 1.4RC3 release ASAP, so
> you could also just wait and try that.)
>
> Cheers,
>
> Doug
>
> _______________________________________________
> Nutch-developers mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/nutch-developers
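
For anyone trying the partial RAM FS idea from Doug's note, here is a rough sketch of how one might decide which index files to copy when the whole index won't fit. It only encodes the priority he gives (.tis, then .frq, then .prx). The function name, the file names, and the sizes are made up for illustration and are not part of Nutch or Lucene, and the "take whatever still fits" policy is my own assumption rather than anything tested.

```python
from pathlib import Path

# Priority order taken from the quoted advice: .tis first, then .frq, then .prx.
PRIORITY = [".tis", ".frq", ".prx"]


def pick_files_for_ram(files, budget_bytes):
    """Choose index files to copy to a RAM FS, highest priority first.

    files: iterable of (name, size_in_bytes) pairs (hypothetical inputs).
    Returns the chosen names in the order they were picked.
    """
    def rank(item):
        suffix = Path(item[0]).suffix
        # Files with any other extension sort after the three named ones.
        return PRIORITY.index(suffix) if suffix in PRIORITY else len(PRIORITY)

    chosen, used = [], 0
    # sorted() is stable, so files of equal rank keep their input order.
    for name, size in sorted(files, key=rank):
        # Assumed policy: skip a file that does not fit and keep looking.
        if used + size <= budget_bytes:
            chosen.append(name)
            used += size
    return chosen


if __name__ == "__main__":
    # Made-up file names and sizes, just to show the ordering.
    index = [("_a.prx", 400), ("_a.frq", 300), ("_a.tis", 100), ("_a.fdt", 900)]
    print(pick_files_for_ram(index, 500))  # ['_a.tis', '_a.frq']
```

Because a file that doesn't fit is skipped instead of stopping the loop, a smaller lower-priority file can still be picked after a larger one is passed over. Change the loop to break on the first miss if strict ordering matters more than filling the budget.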
