regex url filter

2005-06-30 Thread Emilijan Mirceski
If in my regex-urlfilter: >> # skip URLs containing certain characters as probable queries, etc. >> [EMAIL PROTECTED] I skip '?' and '=', I will have more pages in my database. Is there any strong reason why this was disabled in the release version? (My segments have about 100 thousand pages
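
For readers unfamiliar with this kind of filter, here is a minimal Python sketch of the trade-off the question describes. It is not Nutch's implementation, and the actual pattern in the message above is redacted, so the `[?=]` character class, the rule lists, and the example URLs are all assumptions made for illustration. The sketch assumes that each rule is a sign plus a regex and that the first rule whose pattern is found in the URL decides.

```python
import re

# Hypothetical rules in the spirit of a regex URL filter: each rule is a
# (sign, pattern) pair, and the first pattern found in the URL decides.
RULES_STRICT = [
    ("-", re.compile(r"[?=]")),  # assumed rule: drop probable query URLs
    ("+", re.compile(r".")),     # accept everything else
]
RULES_RELAXED = [
    ("+", re.compile(r".")),     # no query-character rule: accept everything
]


def accepts(url, rules):
    """Return True if the first matching rule is a '+' rule."""
    for sign, pattern in rules:
        if pattern.search(url):
            return sign == "+"
    return False  # no rule matched: treat the URL as rejected


# Illustrative URLs only; example.com is a placeholder host.
urls = [
    "http://example.com/index.html",
    "http://example.com/view?id=7",
    "http://example.com/list?page=2&sort=asc",
]

kept_strict = [u for u in urls if accepts(u, RULES_STRICT)]
kept_relaxed = [u for u in urls if accepts(u, RULES_RELAXED)]
print(len(kept_strict), len(kept_relaxed))
```

With the assumed query-character rule in place only the static page survives; without it all three URLs are kept, which is why relaxing the rule increases the page count and also admits dynamically generated pages.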

RE: recursion: see recursion

2005-06-30 Thread Emilijan Mirceski
Problem solved by an appropriate regex query. The reason for the problem is some strange combination of Java code and URLs. -Original Message- From: Emilijan Mirceski [mailto:[EMAIL PROTECTED] Sent: Thursday, June 30, 2005 3:40 PM To: nutch-user@lucene.apache.org Subject: recursion: see r

New build ?

2005-06-30 Thread Kashif Khadim
Hi, I just want to say that there has been no new build for some days; it would help if I could get the latest build. Thanks, Kashif

recursion: see recursion

2005-06-30 Thread Emilijan Mirceski
Lately, I have been getting thousands of variations of the following: 050630 153456 fetching http://www.idividi.com.mk/vesti/makedonija/Politika/315216/mt.net.mk/mt.net. .k/mt.net.mk/mt.net.mk/mt.net.mk/mt.net.mk/mt.net.m k/mt.net.mk/mt.net.mk 050630 153457 Response content length is not known 050630 153458
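
One way to catch URLs like the one above is a backreference that looks for the same path segment repeated several times in a row. The sketch below is only an illustration in Python, not the regex the poster ended up using (the thread does not show it). The threshold of three consecutive copies, the function name `looks_recursive`, and the example URLs are assumptions.

```python
import re

# A path segment such as "/mt.net.mk" followed immediately by at least two
# more copies of itself (three in a row in total). The lookahead makes sure
# the last copy ends at a segment boundary rather than inside a longer name.
REPEATED_SEGMENT = re.compile(r"(/[^/]+)\1{2,}(?=/|$)")


def looks_recursive(url):
    """Return True if some path segment repeats three or more times in a row."""
    return REPEATED_SEGMENT.search(url) is not None


# Illustrative URLs only; example.com is a placeholder host.
print(looks_recursive("http://example.com/news/a.b/a.b/a.b/a.b"))
print(looks_recursive("http://example.com/a/a/index.html"))
```

A URL that matches could then be rejected by an exclusion rule in the URL filter. A lower threshold catches loops sooner but risks rejecting legitimate paths that repeat a directory name.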

Max Urls Per Server how to?

2005-06-30 Thread Ilia S. Yatsenko
I found the property fetcher.server.maxurls, but how and when does it take effect?