Re: Windows Share Crawling & searching

2007-08-21 Thread bikram
Hi all This is my crawl-urlfilter.txt == # The url filter file used by the crawl command. # Better for intranet crawling. # Be sure to change MY.DOMAIN.NAME to your domain name. # Each non-comment, non-blank line con

Re: IRC channel for Nutch?

2007-08-21 Thread Berlin Brown
I think Lucene would be better because you get lucene and nutch users. But nutch works also. On 8/21/07, Lyndon Maydwell <[EMAIL PROTECTED]> wrote: > Sounds like a good idea to me. > -- Berlin Brown http://www.newspiritcompany.com - newspirit technologies

Re: IRC channel for Nutch?

2007-08-21 Thread Lyndon Maydwell
Sounds like a good idea to me.

Re: how to update CrawlDB instead of Recrawling???

2007-08-21 Thread John Mendenhall
> http://today.java.net/pub/a/today/2006/02/16/introduction-to-nutch-2.htm > > The link above is not working... Change the extension from htm to html and it works. JohnM > Naess, Ronny wrote: > > > > Take a look at this article > > http://today.java.net/pub/a/today/2006/02/16/introduction-to-

extra directories in trunk

2007-08-21 Thread Smith Norton
In the trunk I can see some extra folders like 'contrib', 'site'. What are they? These are not present in nutch-0.9.

IRC channel for Nutch?

2007-08-21 Thread Smith Norton
Is there an IRC channel for nutch-users? If not, could we join #nutch at irc.freenode.net to help each other out and discuss?

Re: Any patch for navigation of pages?

2007-08-21 Thread Susam Pal
Find my replies inline. On 8/21/07, Naresh Saxena <[EMAIL PROTECTED]> wrote: > I am unable to find out where to begin. Kindly please help me. > > 1. How can I build nutch-0.9.war after developing a patch and patching it? 'ant war' should do it. See the target name 'war' in 'build.xml'. > 2. Wher

Re: Any patch for navigation of pages?

2007-08-21 Thread Naresh Saxena
I am unable to find out where to begin. Kindly please help me. 1. How can I build nutch-0.9.war after developing a patch and patching it? 2. Where are the files for the Nutch web gui located in the source? On 8/21/07, Michael Wechner <[EMAIL PROTECTED]> wrote: > Naresh Saxena wrote: > > >hi, > >

Re: Any patch for navigation of pages?

2007-08-21 Thread Michael Wechner
Naresh Saxena wrote: hi, this is me posting again since I did not get any response. I know it is too quick to expect a response and probably I am being a little too impatient. but I would like to just get a "yes" or "no" from you all whether result page navigation links are available at all?

Re: Any patch for navigation of pages?

2007-08-21 Thread Andrzej Bialecki
Naresh Saxena wrote: hi, this is me posting again since I did not get any response. I know it is too quick to expect a response and probably I am being a little too impatient. but I would like to just get a "yes" or "no" from you all whether result page navigation links are available at all? If

Re: Any patch for navigation of pages?

2007-08-21 Thread Naresh Saxena
hi, this is me posting again since I did not get any response. I know it is too quick to expect a response and probably I am being a little too impatient. but I would like to just get a "yes" or "no" from you all whether result page navigation links are available at all? If not available, I would

Any patch for navigation of pages?

2007-08-21 Thread Naresh Saxena
Hi friends, I am facing one problem with Nutch. I don't like the default search results page of Nutch. It does not have any navigation links like page 1, 2, 3, 4, etc., like those of Google, Yahoo, etc. Is there any patch for it? Cheers, Naresh

Re: Depth restriction on large crawls

2007-08-21 Thread Vince Filby
Matt, (I didn't notice your message until this morning...) I implemented this as well but I am using Nutch 0.8 so that the generated Lucene index version matches what our front end searcher is using. I implemented this in a manner similar to scoring. The first step was to add a crawl depth fiel

Re: How to submit patches?

2007-08-21 Thread Doğacan Güney
On 8/21/07, Smith Norton <[EMAIL PROTECTED]> wrote: > Thank you for the quick replies. If I do not wish to use svn and the > svn diff am I allowed to submit patches against Nutch 0.9 generated > using diff -au command? Of course you are allowed :). However, most developers use latest trunk so send

Re: How to submit patches?

2007-08-21 Thread Smith Norton
Thank you for the quick replies. If I do not wish to use svn and the svn diff am I allowed to submit patches against Nutch 0.9 generated using diff -au command? On 8/21/07, Doğacan Güney <[EMAIL PROTECTED]> wrote: > On 8/21/07, Smith Norton <[EMAIL PROTECTED]> wrote: > > Is there any guideline for

Re: How to submit patches?

2007-08-21 Thread Doğacan Güney
On 8/21/07, Smith Norton <[EMAIL PROTECTED]> wrote: > Is there any guideline for submission of Patches? Should the patch be > against the last stable version, i.e Nutch 0.9 or something else like > the latest code in the CVS? Nutch wiki has a good article: http://wiki.apache.org/nutch/HowToContrib

Re: How to submit patches?

2007-08-21 Thread Michael Wechner
Smith Norton wrote: Is there any guideline for submission of Patches? Should the patch be against the last stable version, i.e Nutch 0.9 or something else like the latest code in the CVS? IIRC Nutch is using SVN (http://subversion.tigris.org/faq.html) I also keep finding about 'trunk' ver

Re: Windows Share Crawling & searching

2007-08-21 Thread bikram
Hi, there was a problem in my config files and I rectified it, but I am still getting the same error. This is part of the log, which says that certain plugins, including protocol-smb, are not being included... 2007-08-20 10:15:28,891 DEBUG plugin.PluginRepository - parsing: /var/www/html/nutch9loc/plugins/

Re: Problem in creating Index

2007-08-21 Thread sachin_s
Hi, No, I tried a different search string but it returned 0 results. One more thing: while indexing the local file system, I changed nutch-site.xml as follows: plugin.includes protocol-file|protocol-http|parse-(text|html)|index-basic|query-(basic|site|url) h
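For readers following along, the setting quoted in this message would sit in nutch-site.xml roughly as shown here. This is only a sketch: it assumes the usual property layout of Nutch configuration files, and the value is copied from the message as far as the archive preview shows it. The preview is cut off right after the value, so the poster's actual file may have contained more.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Value copied from the message above. The preview is truncated,
       so the original file may have differed. -->
  <property>
    <name>plugin.includes</name>
    <value>protocol-file|protocol-http|parse-(text|html)|index-basic|query-(basic|site|url)</value>
  </property>
</configuration>
```

Settings placed in nutch-site.xml override the defaults shipped in nutch-default.xml, which is why the poster edits this file rather than the defaults.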

How to submit patches?

2007-08-21 Thread Smith Norton
Is there any guideline for submitting patches? Should the patch be against the last stable version, i.e. Nutch 0.9, or something else, like the latest code in CVS? I also keep seeing 'trunk' mentioned very often. I am a newbie to the open source way of doing projects, so it would be great if you coul

Re: Problem in creating Index

2007-08-21 Thread Susam Pal
You are searching for 'apache' in the search results. Are you sure the word 'apache' should exist in the search results? You can try some other string instead of 'apache' that you know would surely exist in one of the websites that you have crawled. There are a number of other things that could g

Re: Problem in creating Index

2007-08-21 Thread sachin_s
Yes, thanks, that solved my problem. However, to check the integrity of the indexes I ran the following command: bin/nutch org.apache.nutch.searcher.NutchBean apache but it returns 0 hits. Can you please tell me what I am missing? Thanks in advance. Regards, Sachin. > You need to

Re: Problem in creating Index

2007-08-21 Thread Susam Pal
You need to set the following properties in 'conf/nutch-site.xml'. In the example below I have left the agent description, agent URL, etc. empty, but ideally you should set them so that the owner of a website can find out who is crawling the site and how to reach them. http.agent.name
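The example in this message is cut off in the archive preview, so what follows is not the poster's original XML. It is a minimal sketch of the kind of block the message describes, assuming the agent properties are named http.agent.name, http.agent.description, http.agent.url and http.agent.email as in the nutch-default.xml of that era; check the names against your own nutch-default.xml. All values are hypothetical placeholders.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Crawler name sent to the sites being crawled. Placeholder value. -->
  <property>
    <name>http.agent.name</name>
    <value>MyTestCrawler</value>
  </property>
  <!-- The message leaves the remaining fields empty; filling them in
       lets site owners see who is crawling and how to reach them. -->
  <property>
    <name>http.agent.description</name>
    <value></value>
  </property>
  <property>
    <name>http.agent.url</name>
    <value></value>
  </property>
  <property>
    <name>http.agent.email</name>
    <value></value>
  </property>
</configuration>
```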