Implement a filter mechanism that allows intercepting every stage of a crawling 
process
-------------------------------------------------------------------------------------

                 Key: DROIDS-58
                 URL: https://issues.apache.org/jira/browse/DROIDS-58
             Project: Droids
          Issue Type: New Feature
    Affects Versions: 0.01
            Reporter: Mingfai Ma


refer to this:
http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200906.mbox/%[email protected]%3e

Assume the crawling process is:
1. poll a link from queue
2. fetch entity
3. parse entity
4. extract outlinks

We provide a mechanism to intercept the process at every stage. For example, a 
LinkFilter interface declares "public T polled(T link);", and any filter may 
reject or transform a Link polled from the queue. Similar logic applies to 
fetching, parsing, and extracting outlinks.
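
A minimal sketch of how the link-polling stage could be intercepted, to make the 
proposal concrete. Only the LinkFilter name and the "polled" signature come from 
this issue. Everything else is an assumption made for illustration: the 
LinkFilterChain and LinkFilterDemo classes are hypothetical, plain strings stand 
in for Link objects, and "returning null means reject" is one possible 
convention, not something Droids defines.

```java
import java.util.ArrayList;
import java.util.List;

// Filter for links polled from the queue (signature taken from the issue text).
// Assumed convention: return null to reject, or return a (possibly
// transformed) link to pass it on to the next stage.
interface LinkFilter<T> {
    T polled(T link);
}

// Hypothetical chain: applies filters in order, stops at the first rejection.
final class LinkFilterChain<T> implements LinkFilter<T> {
    private final List<LinkFilter<T>> filters = new ArrayList<>();

    LinkFilterChain<T> add(LinkFilter<T> filter) {
        filters.add(filter);
        return this;
    }

    @Override
    public T polled(T link) {
        T current = link;
        for (LinkFilter<T> filter : filters) {
            if (current == null) {
                return null; // an earlier filter rejected the link
            }
            current = filter.polled(current);
        }
        return current;
    }
}

// Hypothetical demo using String in place of a real Link type.
final class LinkFilterDemo {
    static LinkFilterChain<String> exampleChain() {
        return new LinkFilterChain<String>()
            // Rejecting filter: drop anything that is not an http(s) link.
            .add(link -> link.startsWith("http") ? link : null)
            // Transforming filter: strip the fragment part, if any.
            .add(link -> {
                int hash = link.indexOf('#');
                return hash < 0 ? link : link.substring(0, hash);
            });
    }

    public static void main(String[] args) {
        LinkFilterChain<String> chain = exampleChain();
        System.out.println(chain.polled("http://example.org/a#top")); // http://example.org/a
        System.out.println(chain.polled("ftp://example.org/file"));   // null
    }
}
```

The same shape could be repeated for the fetch, parse, and extract stages, with 
one filter interface per stage, so that each stage can veto or rewrite its input 
before the next stage runs.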



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
