Re: Nutch fetching skipped files

2008-04-04 Thread Susam Pal
My replies inline.

On Fri, Apr 4, 2008 at 12:47 PM, Vineet Garg <[EMAIL PROTECTED]> wrote:
> Hi
>
> Thanks for the response. Maybe I was not clear in expressing myself.
>
> I am crawling a parent directory in my 'home' on Linux machine therefore my
> urls have to begin with file: and not http:.

Re: Nutch fetching skipped files

2008-04-04 Thread Vineet Garg
I have tried that but it does not work.

[EMAIL PROTECTED] wrote:
Hello Vinet,

Try using regex-urlfilter instead of crawl-urlfilter.

Regards,
Arkadi

-Original Message-
From: Vineet Garg [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 02, 2008 10:34 PM
To: nutch-user@lucene.apach

Re: Nutch fetching skipped files

2008-04-04 Thread Vineet Garg
Hi

Thanks for the response. Maybe I was not clear in expressing myself.

I am crawling a parent directory in my 'home' on Linux machine therefore my urls have to begin with file: and not http:. I have defined the file protocol and the crawl too is okay. My question is though I have modified the c

Re: Nutch fetching skipped files

2008-04-03 Thread Susam Pal
Find my reply inline.

On Wed, Apr 2, 2008 at 5:04 PM, Vineet Garg <[EMAIL PROTECTED]> wrote:
> Hi,
> I am using Nutch to crawl local file system. I am crawling by bin/nutch
> crawl urls -dir crawl -depth 5 -topN 500 >& crawl.log.
> But nutch is fetching files e.g. .css or .png files which i ha

RE: Nutch fetching skipped files

2008-04-03 Thread Arkadi.Kosmynin
Hello Vinet,

Try using regex-urlfilter instead of crawl-urlfilter.

Regards,
Arkadi

> -Original Message-
> From: Vineet Garg [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, April 02, 2008 10:34 PM
> To: nutch-user@lucene.apache.org
> Subject: Nutch fetching skipped files
>
> Hi,
> I am usi
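
For readers following the filter suggestion in this thread, here is a minimal sketch of how sign-prefixed regex rules of the kind kept in a regex-urlfilter file are commonly evaluated. It assumes that the first rule whose pattern is found in the URL decides the outcome and that a URL matching no rule is rejected. The rule lines, the helper names (parse_rules, accepts), and the file paths are hypothetical and are not taken from the thread or from Nutch's source.

```python
import re

# Hypothetical rule set in the style of a URL filter file:
# each rule is a sign ('+' accept, '-' reject) followed by a regex.
RULES_TEXT = r"""
# skip stylesheets and images by extension
-\.(css|png|gif|jpg)$
# accept anything under the file: scheme
+^file:
"""

def parse_rules(text):
    """Turn '+regex' / '-regex' lines into (accept, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        # Any sign other than '+' is treated as a reject rule in this sketch.
        rules.append((line[0] == "+", re.compile(line[1:])))
    return rules

def accepts(url, rules):
    """The first rule whose pattern is found in the URL decides; no match means reject."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False

rules = parse_rules(RULES_TEXT)
print(accepts("file:/home/user/docs/page.html", rules))
print(accepts("file:/home/user/docs/style.css", rules))
```

Under these assumptions rule order matters: if the accept rule for file: came first, the .css and .png exclusions below it would never be reached, which is one way such files can still end up being fetched.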