Is there a way of parallelizing URLFiltering over multiple threads? After all, the URLFilters themselves must already be thread-safe, or else they would have problems during fetching.
The reason I'm asking is that I have a custom URLFilter that needs to make calls to the DNS resolver, and multi-threading the URL filtering would greatly speed up some filtering procedures that, unlike fetching, appear to be single-threaded: "mergedb -filter", inject, generate, "updatedb -filter", etc. (The most important is of course "generate" or, even better, "updatedb -filter", to prevent undesired URLs from reaching the crawldb in the first place.)

Enzo

Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
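
For illustration only, here is a rough sketch of the kind of parallel filtering asked about above. It is not Nutch code: the class ParallelUrlFilter and the method filterAll are hypothetical names, and the filter is modelled as a plain java.util.function.Function that returns null for a rejected URL, as a stand-in for a thread-safe URLFilter. Whether something like this can be hooked into generate or updatedb is exactly the open question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Nutch.
class ParallelUrlFilter {

    // Applies `filter` to every URL on a fixed-size thread pool and returns
    // the accepted URLs in input order. By assumption, `filter` is
    // thread-safe and returns null to reject a URL.
    public static List<String> filterAll(List<String> urls,
                                         Function<String, String> filter,
                                         int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> filter.apply(url));
            }
            List<String> accepted = new ArrayList<>();
            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the submitted tasks.
            for (Future<String> future : pool.invokeAll(tasks)) {
                String result = future.get();
                if (result != null) {
                    accepted.add(result);
                }
            }
            return accepted;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while filtering", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("filter failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a slow, DNS-backed filter: rejects any URL
        // containing "blocked" and passes everything else through.
        Function<String, String> filter =
            url -> url.contains("blocked") ? null : url;
        List<String> urls = Arrays.asList(
            "http://a.example/",
            "http://blocked.example/",
            "http://c.example/");
        System.out.println(filterAll(urls, filter, 4));
    }
}
```

With a filter that blocks on DNS lookups, the pool size (4 here, chosen arbitrarily) would bound how many lookups are in flight at once.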
