Hi all,

I have been trying to fetch URLs of the form:

http://www.xyz.com/?page=1

where the page number can vary from 1 to 100. The first page contains
links to the following ones, so I updated the conf/regex-urlfilter
file and added:

^[0-9]{1,45}$

When I do this, the generate job fails with the error "Invalid
first character". I have also tried generating with topN 5 and depth 5
to fetch more URLs, but that does not work either.
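
To make the intent concrete, here is a small standalone Java sketch (not
Nutch code) that checks the expression above against one of the page URLs,
next to a candidate expression covering the whole URL. The class name, the
candidate pattern and the use of find() are my own assumptions for
illustration; I have not confirmed how Nutch applies the rule internally.

```java
import java.util.regex.Pattern;

public class PageUrlPatternCheck {
    // The expression exactly as I added it to the filter file.
    static final Pattern ORIGINAL = Pattern.compile("^[0-9]{1,45}$");

    // Hypothetical alternative that spells out the whole URL and
    // allows a page number of one to three digits.
    static final Pattern CANDIDATE =
            Pattern.compile("^http://www\\.xyz\\.com/\\?page=[0-9]{1,3}$");

    // Assumption: the expression is searched for within the full URL string.
    static boolean accepts(Pattern pattern, String url) {
        return pattern.matcher(url).find();
    }

    public static void main(String[] args) {
        String url = "http://www.xyz.com/?page=1";
        System.out.println("original:  " + accepts(ORIGINAL, url));  // false
        System.out.println("candidate: " + accepts(CANDIDATE, url)); // true
    }
}
```

The original expression is anchored at both ends and allows only digits, so
it cannot match a string that starts with "http://". This check only covers
the regular expression itself, not the "Invalid first character" failure.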

Could anyone advise me on how to accomplish this? I am running Nutch 2.x.
Thanks in advance!


Renato M.
