Check your dedup parameters.

Deduplication is ON by default, so it reduces the number of hits returned.
Consider the other NutchBean.search methods, which accept dedup parameters.
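
To illustrate why the count collapses, here is a small standalone sketch of
per-site deduplication. It is a toy model and not Nutch's actual code: the
class DedupSketch, the Hit type, the sampleHits helper and the convention
that a limit of 0 means "no dedup" are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {

    /** Hypothetical hit: only the site it belongs to and its URL. */
    public static final class Hit {
        final String site;
        final String url;

        Hit(String site, String url) {
            this.site = site;
            this.url = url;
        }
    }

    /**
     * Keeps at most maxHitsPerDup hits per site, preserving input order.
     * Assumption for this sketch: a limit of 0 or less disables dedup.
     */
    public static List<Hit> dedup(List<Hit> raw, int maxHitsPerDup) {
        if (maxHitsPerDup <= 0) {
            return new ArrayList<>(raw);
        }
        Map<String, Integer> seenPerSite = new HashMap<>();
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : raw) {
            // Count this hit against its site, then keep it only if under the limit.
            int count = seenPerSite.merge(hit.site, 1, Integer::sum);
            if (count <= maxHitsPerDup) {
                kept.add(hit);
            }
        }
        return kept;
    }

    /** Builds n made-up hits that all come from the same site. */
    public static List<Hit> sampleHits(int n) {
        List<Hit> hits = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            hits.add(new Hit("example.com", "http://example.com/page" + i));
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Hit> raw = sampleHits(400);
        System.out.println("with dedup (2 per site): " + dedup(raw, 2).size()); // prints 2
        System.out.println("without dedup: " + dedup(raw, 0).size());           // prints 400
    }
}
```

When every page comes from the same site, a small per-site limit leaves only
a couple of hits out of the 400 matches, which is the same pattern as in your
output.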


Best Regards
Alexander Aristov


On 28 January 2011 08:56, .: Abhishek :. <[email protected]> wrote:

> Hi all,
>
>  I did a nutch crawl on a website and I am using the nutch war file to set
> up a simple search interface in Tomcat server. When I use the nutch web
> interface to search for a keyword, I see it first shows two results and
> says
> "1-2(out of about 400 total matching pages)" when I hit on the "show all
> hits" it shows all the results in paginated format.
>
>  Now, when I use the NutchBean to query for the same keyword, it just shows
> me the number of hits as 2. The code is as follows,
>
>  public static void main(String[] args) {
>
>        String searchString = "food";
>        Configuration nutchConfig = null;
>        NutchBean nutchBean = null;
>        Query nutchQuery = null;
>        Hits nutchHits = null;
>        try{
>            nutchConfig = NutchConfiguration.create();
>            nutchBean = new NutchBean(nutchConfig);
>            nutchQuery = Query.parse(searchString, nutchConfig);
>            nutchHits = nutchBean.search(nutchQuery);
>            System.out.println("Hits : "+nutchHits.getLength());
>            nutchBean.close();
>        } catch (IOException e) {
>            // TODO Auto-generated catch block
>            e.printStackTrace();
>        }finally{
>
>        }
>
>    }
>
>  Is there something wrong in the above code? why is it not showing the hits
> as 400 and just shows as 2?
>
> Thanks,
> Abhishek
>
