Update of /cvsroot/nutch/nutch/src/java/net/nutch/quality
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4633/src/java/net/nutch/quality
Modified Files:
QualityTestTool.java
Log Message:
Full commit for Nutch distributed WebDB.
This is a lot of new code that implements the multi-machine
web database. This means we should be able to update the db
with multiple CPUs and disks simultaneously. (This has been
a major bottleneck for us so far.)
This commit also contains files for the NutchFileSystem, which
is a rudimentary distributed file system. The Distributed WebDB
is built on top of NutchFS. There are two implementations of
NutchFS: one for machines mounting NFS (network file system), and
one for machines that need to use a remote SSL connection. The
former is well-tested, but the latter is still a little sketchy.
I've done what little testing I can do on my laptop. I'm putting
code back so that other people can take a look, and so we can put
it on multiple machines.
Note that I've put changes back to the files "DistributedWebDBWriter"
and "DistributedWebDBReader". These are meant to replace "WebDBWriter"
and "WebDBReader," but I didn't want to disturb the source base
until the distributed code is tested further.
Index: QualityTestTool.java
===================================================================
RCS file: /cvsroot/nutch/nutch/src/java/net/nutch/quality/QualityTestTool.java,v
retrieving revision 1.8
retrieving revision 1.9
diff -C2 -d -r1.8 -r1.9
*** QualityTestTool.java 3 Jul 2003 12:36:51 -0000 1.8
--- QualityTestTool.java 30 Jan 2004 22:11:43 -0000 1.9
***************
*** 77,81 ****
LOG.info("CreateInputs, 1 of 6: Copying query list...");
File targetQueryList = new File(inputsDir, QUERY_LIST);
! FileUtil.copyContents(queryList, targetQueryList);
//
--- 77,81 ----
LOG.info("CreateInputs, 1 of 6: Copying query list...");
File targetQueryList = new File(inputsDir, QUERY_LIST);
! FileUtil.copyContents(queryList, targetQueryList, true);
//
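
The diff above only shows that the call to FileUtil.copyContents gained a third argument, "true"; the log message does not say what the flag means. As a rough sketch, assuming the flag controls whether an existing target file may be overwritten (an assumption, not confirmed by this commit), a three-argument copy helper could look like the following. The class name CopySketch and the parameter name "overwrite" are hypothetical, and this is not the actual Nutch FileUtil code.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in for a three-argument copy helper; NOT the Nutch FileUtil.
public class CopySketch {

    // ASSUMPTION: the boolean means "overwrite the target if it already exists".
    static void copyContents(File src, File dst, boolean overwrite) throws IOException {
        if (overwrite) {
            // Replace the target file if it is already present.
            Files.copy(src.toPath(), dst.toPath(), StandardCopyOption.REPLACE_EXISTING);
        } else {
            // Without REPLACE_EXISTING, Files.copy throws
            // FileAlreadyExistsException when the target exists.
            Files.copy(src.toPath(), dst.toPath());
        }
    }

    public static void main(String[] args) throws IOException {
        File queryList = File.createTempFile("queries", ".txt");
        Files.write(queryList.toPath(), "nutch\n".getBytes(StandardCharsets.UTF_8));

        // createTempFile creates the file, so the target already exists here.
        File targetQueryList = File.createTempFile("target", ".txt");

        copyContents(queryList, targetQueryList, true);
        byte[] copied = Files.readAllBytes(targetQueryList.toPath());
        System.out.println(new String(copied, StandardCharsets.UTF_8).trim());
    }
}
```

Under that assumption, passing "true" in QualityTestTool would let the "CreateInputs" step be re-run against an inputs directory that already holds a copy of the query list.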
_______________________________________________
Nutch-cvs mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-cvs