Gal Nitzan wrote:
> IMHO the data that is needed i.e. the data that will be fetched in the next
> fetch process is already available in the <item> element. Each <item> element
> represents one web resource. And there is no reason to go to the server and
> re-fetch that resource.

Perhaps ProtocolOutput should change. The method:

  Content getContent();

could be deprecated and replaced with:

  Content[] getContents();

This would require changes to the indexing pipeline. I can't think of any severe complications, but I haven't looked closely.

Could something like that work?

Doug

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
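
To make the proposal above concrete, here is a rough sketch of what a multi-content ProtocolOutput might look like. It is not the actual Nutch code: the Content class is a simplified stand-in, FeedExample.fromItems is a hypothetical helper, and having the deprecated getContent() return the first element is only one possible compatibility choice.

```java
import java.nio.charset.StandardCharsets;

// Simplified stand-in for Nutch's Content class (assumption: only url + bytes).
final class Content {
    private final String url;
    private final byte[] data;

    Content(String url, byte[] data) {
        this.url = url;
        this.data = data;
    }

    String getUrl() { return url; }
    byte[] getData() { return data; }
}

// Sketch of a ProtocolOutput that can carry several Content objects per fetch.
final class ProtocolOutput {
    private final Content[] contents;

    ProtocolOutput(Content... contents) {
        this.contents = contents.clone();
    }

    // Proposed accessor: every resource produced by a single fetch.
    Content[] getContents() {
        return contents.clone();
    }

    // Old accessor, kept so existing callers still compile.
    // Assumption: it returns the first Content, or null if there is none.
    @Deprecated
    Content getContent() {
        return contents.length == 0 ? null : contents[0];
    }
}

// Hypothetical helper: one Content per already-parsed <item>,
// given as {url, body} pairs, so no item needs to be re-fetched.
final class FeedExample {
    static ProtocolOutput fromItems(String[][] items) {
        Content[] out = new Content[items.length];
        for (int i = 0; i < items.length; i++) {
            out[i] = new Content(items[i][0],
                    items[i][1].getBytes(StandardCharsets.UTF_8));
        }
        return new ProtocolOutput(out);
    }
}
```

With something like this, the indexing pipeline would loop over getContents() rather than handling a single Content, which is where most of the required changes would presumably land.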
