Thanks sir, search.jsp is working now. But I suppose the plugin is not
working, because the meta data content is present in the page and it is
not being retrieved.

I also did a segment read of the HTML documents. It shows the meta data in
the content, but the Recommended field is not working in the search. How do
I search using the Recommended field name?

The output of the segment read, including the meta content, is as follows:

bin/nutch readseg -get arpit11/segments/20070624072952 "http://localhost:8080/110.html"
SegmentReader: get 'http://localhost:8080/110.html'
Content::
Version: 2
url: http://localhost:8080/110.html
base: http://localhost:8080/110.html
contentType: text/html
metadata: Content-Length=1517 Connection=close ETag=W/"1517-1182650258000"
nutch.segment.name=20070624072952 nutch.crawl.score=1.0
nutch.content.digest=f684040b7a5b17219926d69b595c56de Date=Sun, 24 Jun 2007
02:00:24 GMT Server=Apache-Coyote/1.1 Content-Type=text/html
Last-Modified=Sun, 24 Jun 2007 01:57:38 GMT
Content:
<HTML><HEAD> <TITLE> Methods of mathematical physics. Vol-1
</TITLE>
<meta name="Title"content=" Methods of mathematical physics. Vol-1
"> <meta name="Author"content=" COURANT(R)
">  <meta name="Publishers"content=" John Wiley
">
<meta name="Recommended"content="aaa">
</HEAD><BODY>
<BR> <B>     Title       :</B> Methods of mathematical physics. Vol-1
<BR> <B>     Main Ent    :</B> COURANT(R)
<BR> <B>     Add. Ent    :</B> HILBERT(D)
<BR> <B>     Publ. Plc.  :</B> New York
<BR> <B>     Publisher   :</B> John Wiley
<BR> <B>     Publ. Dt.   :</B> 1953
<BR> <B>     Pages       :</B> xvi+561
<BR> <B>     Notes       :</B> Raman collection
<BR> <B>     Class No.   :</B> 51-73<BR> <B>     Book No.    :</B> COU-1;1
<BR> <B>     Acc NO.     :</B> 781   C
<BR> <B>     Keywords    :</B> <methods ><mathematical physics ><>
<BR> <B>     Lib. Code   :</B> RRI
</BODY></HTML>

Crawl Generate::
Version: 5
Status: 1 (db_unfetched)
Fetch time: Sun Jun 24 07:29:01 IST 2007
Modified time: Thu Jan 01 05:30:00 IST 1970
Retries since fetch: 0
Retry interval: 30.0 days
Score: 1.0
Signature: null
Metadata: _ngt_:1182650392342

Crawl Fetch::
Version: 5
Status: 33 (fetch_success)
Fetch time: Sun Jun 24 07:30:24 IST 2007
Modified time: Thu Jan 01 05:30:00 IST 1970
Retries since fetch: 0
Retry interval: 30.0 days
Score: 1.0
Signature: f684040b7a5b17219926d69b595c56de
Metadata: _ngt_:1182650392342 _pst_:success(1), lastModified=0

Crawl Parse::
Version: 5
Status: 65 (signature)
Fetch time: Sun Jun 24 07:30:28 IST 2007
Modified time: Thu Jan 01 05:30:00 IST 1970
Retries since fetch: 0
Retry interval: 0.0 days
Score: 1.0
Signature: 5a009d3ca1bc7429d273cb3298a471c2
Metadata: null

ParseData::
Version: 5
Status: success(1,0)
Title: Methods of mathematical physics. Vol-1
Outlinks: 0
Content Metadata: Content-Length=1517 Connection=close
ETag=W/"1517-1182650258000" nutch.segment.name=20070624072952
nutch.crawl.score=1.0
nutch.content.digest=f684040b7a5b17219926d69b595c56de Date=Sun, 24 Jun
2007 02:00:24 GMT Server=Apache-Coyote/1.1
Content-Type=text/html Last-Modified=Sun, 24 Jun 2007 01:57:38 GMT
Parse Metadata: CharEncodingForConversion=windows-1252

ParseText::
Methods of mathematical physics. Vol-1 Title : Methods of mathematical
physics. Vol-1 Main Ent : COURANT(R) Add. Ent : HILBERT(D) Publ. Plc. : New
York Publisher : John Wiley Publ. Dt. : 1953 Pages : xvi+561 Notes : Raman
collection Class No. : 51-73 Book No. : COU-1;1 Acc NO. : 781 C Keywords :
<> Lib. Code : RRI


On 6/24/07, Doğacan Güney <[EMAIL PROTECTED]> wrote:

On 6/24/07, karan <[EMAIL PROTECTED]> wrote:
> hi sir
>
> i am using the nutch-0.9 version...
>
> the stack trace is as follows
> 2007-06-24 03:24:53,023 ERROR [jsp] - Servlet.service() for servlet jsp threw exception
> java.lang.NullPointerException
>         at org.apache.nutch.searcher.QueryFilters.filter(QueryFilters.java:109)
>         at org.apache.nutch.searcher.IndexSearcher.search(IndexSearcher.java:94)
>         at org.apache.nutch.searcher.NutchBean.search(NutchBean.java:250)
>         at org.apache.jsp.search_jsp._jspService(search_jsp.java:259)
>         at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
>         at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:328)
>         at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
>         at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
>         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
>         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
>         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
>         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
>         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870)
>         at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
>         at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
>         at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
>         at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
>         at java.lang.Thread.run(Thread.java:595)
>

Is your plugin a query filter? If so, remember that you have to name
the fields that you are going to use in your plugin. For example, this
is part of the plugin.xml file from query-url:

   <extension id="org.apache.nutch.searcher.url.URLQueryFilter"
              name="Nutch URL Query Filter"
              point="org.apache.nutch.searcher.QueryFilter">
      <implementation id="URLQueryFilter"
                      class="org.apache.nutch.searcher.url.URLQueryFilter">
        <parameter name="fields" value="url"/>
    <!-- ^^^^^^^^^^^^^^^^^^^^^^^^ -->
      </implementation>
   </extension>

You have to define the parameters like the one above.
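
For the recommended plugin, the corresponding entry might look like the
sketch below. The extension id, name and class here are assumptions
modelled on the plugin writing example; they have to match the package and
class of your own plugin. The field name "recommended" is also an
assumption: it should be whatever field name your indexing filter writes.

```xml
<!-- Sketch only: id, name and class are hypothetical and must match your plugin -->
<extension id="org.apache.nutch.parse.recommended.recommendedSearcher"
           name="Recommended Search Query Filter"
           point="org.apache.nutch.searcher.QueryFilter">
   <implementation id="RecommendedQueryFilter"
                   class="org.apache.nutch.parse.recommended.RecommendedQueryFilter">
      <!-- Names the index field this query filter handles (assumed name) -->
      <parameter name="fields" value="recommended"/>
   </implementation>
</extension>
```

With the field declared this way, a query of the form recommended:aaa
should be handed to the filter. A query filter extension that names no
fields would be consistent with the NullPointerException in
QueryFilters.filter in your stack trace, although that is an inference from
the trace and not something I have verified.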


>
>
>
> On 6/24/07, Doğacan Güney <[EMAIL PROTECTED]> wrote:
> >
> > On 6/24/07, karan <[EMAIL PROTECTED]> wrote:
> > > hey...
> > >
> > > Thanks for the reply. Tomcat in run mode does print exceptions at the
> > > terminal :) The output shows the plugin is in the registered list of
> > > plugins, and the error message that is generated is as follows:
> > >
> > > 2007-06-24 03:14:47,918 ERROR [jsp] - Servlet.service() for servlet jsp
> > > threw exception
> > > java.lang.NullPointerException
> >
> > There should be more to it... Can you please also send the stack
> > trace? Also, which version of nutch are you using?
> >
> > >
> > > I don't have any idea why this is happening :( Can you please help?
> > >
> > >
> > >
> > > On 6/24/07, Doğacan Güney <[EMAIL PROTECTED]> wrote:
> > > >
> > > > On 6/24/07, karan <[EMAIL PROTECTED]> wrote:
> > > > > hi
> > > > >
> > > > > I just tried to build the recommended plugin that is given in the
> > > > > plugin writing example. When I included the plugin in the
> > > > > plugin.includes property, search.jsp displays nothing, just an
> > > > > empty HTML page.
> > > > >
> > > > > I am not able to figure out what is happening :( When I remove the
> > > > > recommended plugin from there, the search.jsp page is displayed
> > > > > normally.
> > > >
> > > > If you are using tomcat, please start it in 'run' mode
> > > > (./catalina.sh run) and check if tomcat prints an exception.
> > > >
> > > > >
> > > > > Please help, it's really urgent.
> > > > >
> > > >
> > > >
> > > > --
> > > > Doğacan Güney
> > > >
> > >
> >
> >
> > --
> > Doğacan Güney
> >
>


--
Doğacan Güney

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
