Hi,

JSParseFilter implements the filter function, which is called when HTML
pages are filtered. I found the following declaration in the plugin.xml
file under the $NUTCH_HOME\src\plugin\parse-js directory:

<extension id="org.apache.nutch.parse.js.JSParseFilter"
           name="Parse JS Filter"
           point="org.apache.nutch.parse.HtmlParseFilter">
  <implementation id="JSParseFilter"
                  class="org.apache.nutch.parse.js.JSParseFilter">
    <parameter name="contentType" value="application/x-javascript"/>
    <parameter name="pathSuffix" value=""/>
  </implementation>
</extension>

I assume this declaration is what causes the filter function in
JSParseFilter to be called.

My questions are:

1) I don't see the implementation id "JSParseFilter" used anywhere in the
parse-plugins.xml file under the $NUTCH_HOME\conf folder. How, then, does
Nutch know that this filter function should be called?

2) I want to replace this filter with my own filter, so I wrote the
following code:

<extension id="com.mycompany.nutch.parse.MyParseFilter"
           name="Parse JS Filter"
           point="org.apache.nutch.parse.HtmlParseFilter">
  <implementation id="MyParseFilter"
                  class="com.mycompany.nutch.parse.MyParseFilter">
  </implementation>
</extension>

and put it into the plugin.xml file under the $NUTCH_HOME\src\myplugin
directory. However, my filter is never called. Any ideas?


Thanks a lot.

Jeff
