I don't use the Nutch web application, but you don't have to start Nutch in the searcher directory. You can set the location of the searcher dir in the nutch-site.xml config file.
Add this node and set the location of your index:

<property>
  <name>searcher.dir</name>
  <value>/your/path/to/your/index</value>
  <description>
  Path to root of crawl. This directory is searched (in order)
  for either the file search-servers.txt, containing a list of
  distributed search servers, or the directory "index" containing
  merged indexes, or the directory "segments" containing segment
  indexes.
  </description>
</property>

On 7/19/07, Robert Young <[EMAIL PROTECTED]> wrote:
> Tomcat only comes into it because we have to start Tomcat in the
> searcher directory, I'm guessing it's the same however you choose to
> use Nutch. It would still have to do a rename across physical volumes
> if searcher.dir is set to something different, would it not?
>
> How does this sound as a solution? Allow the user to set a
> configuration option setting the linkdb working dir, or allow the user
> to set a configuration flag to use another particular configuration
> option to set the base dir. Otherwise fall back to the default which
> is the current working directory.
>
> Cheers
> Rob
>
> On 7/19/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
> > Robert Young wrote:
> > > In org.apache.nutch.crawl.LinkDb on line 261 it creates a working
> > > directory (newLinkDb) based on the current working directory. This
> > > should be configurable rather than being based on where Tomcat was
> > > started. I am planning on writing a patch to pull the hadoop.tmp.dir
> > > setting if it is available, falling back to the current directory.
> > >
> > > Can anyone see any obvious problems with doing this?
> >
> > I'm not sure what Tomcat has to do with this. LinkDb does it this way in
> > order to avoid rename() operation across physical volumes - if you
> > invoke rename() on a local FS it may trigger a costly copy operation.
> >
> >
> > --
> > Best regards,
> > Andrzej Bialecki     <><
> >  ___. ___ ___ ___ _ _   __________________________________
> > [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> > ___|||__||  \|  ||  |  Embedded Unix, System Integration
> > http://www.sigram.com  Contact: info at sigram dot com
> >
> >
>

-- 
"Conscious decisions by conscious minds are what make reality real"

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
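
P.S. For what it's worth, the fallback Rob describes (a dedicated option first, then hadoop.tmp.dir, then the current working directory) could be sketched roughly as below. This is only an illustration and not the actual LinkDb code: the class WorkDirResolver and the property name linkdb.work.dir are made up for this sketch, and java.util.Properties stands in for the real Nutch/Hadoop configuration object.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of Nutch: chooses the base directory for the
// temporary LinkDb output from configuration, falling back to the CWD.
class WorkDirResolver {

    // Hypothetical property name, invented for this sketch.
    static final String LINKDB_WORK_DIR = "linkdb.work.dir";

    // Setting mentioned in the thread as a possible fallback.
    static final String HADOOP_TMP_DIR = "hadoop.tmp.dir";

    static Path resolve(Properties conf) {
        // 1. Dedicated option, if the user set one.
        String dir = conf.getProperty(LINKDB_WORK_DIR);
        if (dir == null || dir.trim().isEmpty()) {
            // 2. Otherwise try hadoop.tmp.dir.
            dir = conf.getProperty(HADOOP_TMP_DIR);
        }
        if (dir == null || dir.trim().isEmpty()) {
            // 3. Default: the directory the JVM (e.g. Tomcat) was started in.
            dir = System.getProperty("user.dir");
        }
        return Paths.get(dir.trim()).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(HADOOP_TMP_DIR, "/tmp/hadoop");
        // Prints the resolved working directory for this configuration.
        System.out.println(resolve(conf));
    }
}
```

Note that this does nothing about Andrzej's point: if the resolved directory sits on a different physical volume than the final linkdb, the rename() may still turn into a copy, so whoever sets the option would want to keep both on the same volume.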