If in my regex-urlfilter:

>> # skip URLs containing certain characters as probable queries, etc.
>> [EMAIL PROTECTED]

I leave out '?' and '=', I will have more pages in my database.

Is there any strong reason why this was disabled in the release version? 
(My segments hold about 100 thousand pages in total, which is barely 1.2 GB.)

Regards,
Emilijan



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
