Not the crawldb, and surely not entire files, but information about the indexes. If you modify directory information while files are still held open by a process (e.g. by renaming the directory that contains them and creating a new directory with the old name), the process keeps accessing the original files on disk until it closes and reopens them (hence my question about mergesegs and mergedb).
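To illustrate the point about open files, here is a minimal Python sketch of the general behaviour on a POSIX filesystem. It is not Nutch or Tomcat code: the directory name "index", the file name "segment.txt" and the function demo_rename_under_open_file are hypothetical stand-ins, and the two os.rename/os.mkdir calls only imitate the "mv" swap described in the quoted message below.

```python
import os
import shutil
import tempfile


def demo_rename_under_open_file():
    """Return (text read via the already-open handle, text read after reopening)."""
    base = tempfile.mkdtemp()
    live = os.path.join(base, "index")        # hypothetical stand-in for the live directory
    os.mkdir(live)
    path = os.path.join(live, "segment.txt")  # hypothetical index file
    with open(path, "w") as f:
        f.write("old index data")

    # The long-running process (the webapp, in this thread) opens the file.
    handle = open(path)

    # Swap the directory underneath it, as a pair of "mv" commands would.
    os.rename(live, os.path.join(base, "index.old"))
    os.mkdir(live)
    with open(path, "w") as f:
        f.write("new index data")

    # The open handle still refers to the original file, not to the new path.
    seen_by_open_handle = handle.read()
    handle.close()

    # Reopening resolves the path again and finds the new file.
    with open(path) as f:
        seen_after_reopen = f.read()

    shutil.rmtree(base)
    return seen_by_open_handle, seen_after_reopen


if __name__ == "__main__":
    print(demo_rename_under_open_file())
```

This assumes Unix-style semantics, where an open file descriptor follows the file itself rather than its path; other platforms may refuse the rename instead.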
----- Original Message -----
From: "Manoharam Reddy" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, May 28, 2007 1:53 PM
Subject: Re: Deleting crawl still gives proper results

> The webapp caches the whole crawldb? Can anyone please tell me where
> does it cache the whole crawldb? I don't think it is possible to cache
> it on RAM. Is it cached in some location on the hard disk.
>
> Please clarify this point.
>
> On 5/27/07, Enzo Michelangeli <[EMAIL PROTECTED]> wrote:
>> ----- Original Message -----
>> From: "Manoharam Reddy" <[EMAIL PROTECTED]>
>> To: <[EMAIL PROTECTED]>
>> Sent: Saturday, May 26, 2007 6:23 PM
>>
>> > After I create the crawldb after running bin/nutch crawl, I start my
>> > Tomcat server. It gives proper search results.
>> >
>> > What I am wondering is that even after I delete, the 'crawl' folder,
>> > the search page still gives proper search results. How is this
>> > possible? Only after I restart the Tomcat server, it stops giving
>> > results.
>>
>> The webapp seems to cache data. I have a related problem: updates to the
>> indexes are only noticed after restarting Tomcat (so I have scheduled a
>> nightly cron job to do that).
>>
>> Question for the Ones Who Know: in "bin/nutch mergesegs", can I use the
>> same directory for input and output?
>>
>> For example:
>>
>> bin/nutch mergesegs crawl/segments -dir crawl/segments
>>
>> Same for mergedb: can I issue:
>>
>> bin/nutch mergedb crawl/crawldb crawl/crawldb
>>
>> At present I pass through temporary directories, and then I switch them
>> in place of the old ones with a couple of "mv", but I don't know if that's
>> necessary, or may even be harmful (for example, leaving the webapp,
>> unaware of the "mv", pointing to the inode of the old directory). And I
>> noticed that "bin/nutch mergedb" does not create the output directory
>> until it's done, so I wonder if the explicit use of a temporary directory
>> in my scripts is redundant.
>>
>> Enzo
