Please send an email to [email protected].

Thanks,
Chris



On 7/13/10 9:47 PM, "amol....." <[email protected]> wrote:

Hi Admin,

Please remove me from the list of this Nutch Group.

Thanks,
-Amol

On Tue, Jul 13, 2010 at 7:40 AM, jeff <[email protected]> wrote:

> Hi,
>
> Within JSParseFilter, there is an implementation of the filter function,
> which will be called when HTML pages are to be filtered. I found the
> following code in the plugin.xml file under $NUTCH_HOME\src\plugin
> \parse-js directory:
>
> <extension id="org.apache.nutch.parse.js.JSParseFilter"
>              name="Parse JS Filter"
>              point="org.apache.nutch.parse.HtmlParseFilter">
>      <implementation id="JSParseFilter"
>         class="org.apache.nutch.parse.js.JSParseFilter">
>        <parameter name="contentType" value="application/x-javascript"/>
>        <parameter name="pathSuffix"  value=""/>
>      </implementation>
>   </extension>
>
> I assume this code determines that the filter function within
> JSParseFilter will be called.
>
> My questions are:
>
> 1) I don't see the implementation id "JSParseFilter" used in the
> parse-plugins.xml file under the $NUTCH_HOME\conf folder. How, then, does
> Nutch know that this filter function should be called?
>
> 2) I want to replace this filter with my own filter, and I wrote the
> following code:
>
> <extension id="com.mycompany.nutch.parse.MyParseFilter"
>              name="Parse JS Filter"
>              point="org.apache.nutch.parse.HtmlParseFilter">
>      <implementation id="MyParseFilter"
>         class="com.mycompany.nutch.parse.MyParseFilter">
>      </implementation>
>   </extension>
>
> and put it into the plugin.xml file under $NUTCH_HOME\src\myplugin
> directory. But my filter is never called. Any ideas?
>
>
> Thanks a lot.
>
> Jeff
>
>


--
Regards,

Amol Badgujar.



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
